Yale N. Patt Sanjay J. Patel

Slides:



Advertisements
Similar presentations
PART 6 Programming 1.Decomposition 2.Three Constructs 3.Counting Example 4.Debugging.
Advertisements

Chapter 3 Digital Logic Structures. 3-2 Transistor: Building Block of Computers Microprocessors contain millions of transistors Intel Pentium 4 (2000):
Chapter 5 The LC-3.
Chapter 4 - ISA 1.The Von Neumann Model. 4-2 The Stored Program Computer 1943: ENIAC Presper Eckert and John Mauchly -- first general electronic computer.
LC-3 Instruction Set Architecture (Textbook’s Chapter 5)
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1-1 Introduction to Computing Systems: From Bits and Gates.
LC-3 Instruction Set Architecture
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1-1 Introduction to Computing Systems: From Bits and Gates.
Chapter 4 The Von Neumann Model
Chapter 5 The LC Instruction Set Architecture ISA = All of the programmer-visible components and operations of the computer memory organization.
Introduction to Computer Engineering CS/ECE 252, Fall 2009 Prof. Mark D. Hill Computer Sciences Department University of Wisconsin – Madison.
The LC-3. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 5-2 Instruction Set Architecture ISA = All of the.
Chapter 5 The LC Instruction Set Architecture ISA = All of the programmer-visible components and operations of the computer memory organization.
Chapter 3 Digital Logic Structures. 3-2 Combinational vs. Sequential Combinational Circuit always gives the same output for a given set of inputs  ex:
Chapter 3 Digital Logic Structures. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 3-2 Transistor: Building.
Chapter 6 Programming. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 6-2 Solving Problems using a Computer.
Von Neumann Model Computer Organization I 1 September 2009 © McQuain, Feng & Ribbens The Stored Program Computer 1945: John von Neumann –
Chapter 4 The Von Neumann Model
Introduction to Computing Systems and Programming The LC-2.
Introduction to Computing Systems and Programming Programming.
CS150: Computer Organization and Architecture Michael D. Wilder, Ph.D.
Chapter 1 Welcome Aboard. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1-2 Introduction to the World of.
Yale N. Patt Sanjay J. Patel
Bits, Data Types, and Operations
Chapter 2 Bits, Data Types, and Operations
Introduction to Computer Engineering
Instructor:Po-Yu Kuo 教師:郭柏佑
Chapter 4 The Von Neumann Model
Instructor:Po-Yu Kuo 教師:郭柏佑
Introduction to Computer Engineering
Introduction to Computer Engineering
COSC121: Computer Systems: Review
Chapter 4 The Von Neumann Model
Introduction to Computer Engineering
Chapter 4 The Von Neumann Model
Chapter 5 The LC-3.
Welcome Aboard 1.
Chapter 4 The Von Neumann Model
LC-3 Details and Examples
Chapter 6 Programming.
Chapter 3 Digital Logic Structures
Chapter 5 The LC-3.
Introduction to Computer Engineering
Introduction to Computer Engineering
Instructor:Po-Yu Kuo 教師:郭柏佑
Instructor:Po-Yu Kuo 教師:郭柏佑
Introduction to Computer Engineering
Introduction to Computer Engineering
Chapter 6 Programming.
Computer Architecture
Introduction to Computer Engineering
Chapter 4 The Von Neumann Model
School of Computer Science and Technology
The Stored Program Computer
Introduction to Computer Engineering
COSC121: Computer Systems
Introduction to Computer Engineering
Introduction to Computer Engineering
Introduction to Computer Engineering
Introduction to Computer Engineering
Introduction to Computer Engineering
Introduction to Computer Engineering
Introduction to Computer Engineering
Introduction to Computer Engineering
Introduction to Computer Engineering
Chapter 6 Programming.
Chapter 2 Bits, Data Types, and Operations
Chapter 4 The Von Neumann Model
Chapter 5 The LC-3.
Presentation transcript:

Yale N. Patt Sanjay J. Patel Introduction to Computing Systems: From Bits and Gates to C and Beyond 2nd Edition Yale N. Patt Sanjay J. Patel Slides prepared by Gregory T. Byrd, North Carolina State University

Chapter 1 Welcome Aboard

Introduction to the World of Computing Computer: electronic genius? NO! Electronic idiot! Does exactly what we tell it to, nothing more. Goal of the course: You will be able to write programs in C and understand what’s going on underneath. Approach: Build understanding from the bottom up. Bits  Gates  Processor  Instructions  C Programming

Two Recurring Themes Abstraction Hardware vs. Software Productivity enhancer – don’t need to worry about details… Can drive a car without knowing how the internal combustion engine works. …until something goes wrong! Where’s the dipstick? What’s a spark plug? Important to understand the components and how they work together. Hardware vs. Software It’s not either/or – both are components of a computer system. Even if you specialize in one, you should understand capabilities and limitations of both.

Big Idea #1: Universal Computing Device All computers, given enough time and memory, are capable of computing exactly the same things. = = PDA Workstation Supercomputer

Tadd Tmul Turing Machine Mathematical model of a device that can perform any computation – Alan Turing (1937) ability to read/write symbols on an infinite “tape” state transitions, based on current state and symbol Every computation can be performed by some Turing machine. (Turing’s thesis) Tadd a,b a+b Turing machine that adds Tmul a,b ab Turing machine that multiplies For more info about Turing machines, see http://www.wikipedia.org/wiki/Turing_machine/ For more about Alan Turing, see http://www.turing.org.uk/turing/

Universal Turing Machine A machine that can implement all Turing machines -- this is also a Turing machine! inputs: data, plus a description of computation (other TMs) U a,b,c c(a+b) Universal Turing Machine Tadd, Tmul U is programmable – so is a computer! instructions are part of the input data a computer can emulate a Universal Turing Machine A computer is a universal computing device.

From Theory to Practice In theory, computer can compute anything that’s possible to compute given enough memory and time In practice, solving problems involves computing under constraints. time weather forecast, next frame of animation, ... cost cell phone, automotive engine controller, ... power cell phone, handheld video game, ...

Big Idea #2: Transformations Between Layers Problems Algorithms Language Instruction Set Architecture Microarchitecture Circuits Devices

How do we solve a problem using a computer? A systematic sequence of transformations between layers of abstraction. Problem Software Design: choose algorithms and data structures Algorithm Programming: use language to express design Program Compiling/Interpreting: convert language to machine instructions Instr Set Architecture

Deeper and Deeper… Processor Design: Instr Set Architecture Processor Design: choose structures to implement ISA Microarch Logic/Circuit Design: gates and low-level circuits to implement components Circuits Process Engineering & Fabrication: develop and manufacture lowest-level components Devices

Descriptions of Each Level Problem Statement stated using "natural language" may be ambiguous, imprecise Algorithm step-by-step procedure, guaranteed to finish definiteness, effective computability, finiteness Program express the algorithm using a computer language high-level language, low-level language Instruction Set Architecture (ISA) specifies the set of instructions the computer can perform data types, addressing mode

Descriptions of Each Level (cont.) Microarchitecture detailed organization of a processor implementation different implementations of a single ISA Logic Circuits combine basic operations to realize microarchitecture many different ways to implement a single function (e.g., addition) Devices properties of materials, manufacturability

Many Choices at Each Level Solve a system of equations Gaussian elimination Jacobi iteration Red-black SOR Multigrid FORTRAN C C++ Java Intel x86 PowerPC Atmel AVR Centrino Pentium 4 Xeon Ripple-carry adder Carry-lookahead adder CMOS Bipolar GaAs Tradeoffs: cost performance power (etc.) Sun and Java are trademarks of Sun Microsystems, Inc. Intel, Pentium, Centrino, and Xeon are trademarks of Intel Corporation. AMD and Athlon and trademarks of Advanced Micro Devices, Inc. Atmel and AVR are registered trademarks of Atmel Corporation. PowerPC is a trademark of International Business Machines Corporation.

Course Outline Bits and Bytes Digital Logic How do we represent information using electrical signals? Digital Logic How do we build circuits to process information? Processor and Instruction Set How do we build a processor out of logic elements? What operations (instructions) will we implement? Assembly Language Programming How do we use processor instructions to implement algorithms? How do we write modular, reusable code? (subroutines) I/O, Traps, and Interrupts How does processor communicate with outside world? C Programming How do we write programs in C? How do we implement high-level programming constructs?

Chapter 2 Bits, Data Types, and Operations

How do we represent data in a computer? At the lowest level, a computer is an electronic machine. works by controlling the flow of electrons Easy to recognize two conditions: presence of a voltage – we’ll call this state “1” absence of a voltage – we’ll call this state “0” Could base state on value of voltage, but control and detection circuits more complex. compare turning on a light switch to measuring or regulating voltage

Computer is a binary digital system. finite number of symbols Binary (base two) system: has two states: 0 and 1 Basic unit of information is the binary digit, or bit. Values with more than two states require multiple bits. A collection of two bits has four possible states: 00, 01, 10, 11 A collection of three bits has eight possible states: 000, 001, 010, 011, 100, 101, 110, 111 A collection of n bits has 2n possible states.

What kinds of data do we need to represent? Numbers – signed, unsigned, integers, floating point, complex, rational, irrational, … Text – characters, strings, … Images – pixels, colors, shapes, … Sound Logical – true, false Instructions … Data type: representation and operations within the computer We’ll start with numbers…

329 101 Unsigned Integers 102 101 100 22 21 20 Non-positional notation could represent a number (“5”) with a string of ones (“11111”) problems? Weighted positional notation like decimal numbers: “329” “3” is worth 300, because of its position, while “9” is only worth 9 most significant least significant 329 102 101 100 101 22 21 20 3x100 + 2x10 + 9x1 = 329 1x4 + 0x2 + 1x1 = 5

Unsigned Integers (cont.) An n-bit unsigned integer represents 2n values: from 0 to 2n-1. 22 21 20 1 2 3 4 5 6 7

Unsigned Binary Arithmetic Base-2 addition – just like base-10! add from right to left, propagating carry carry 10010 10010 1111 + 1001 + 1011 + 1 11011 11101 10000 10111 + 111 Subtraction, multiplication, division,…

Signed Integers With n bits, we have 2n distinct values. assign about half to positive integers (1 through 2n-1) and about half to negative (- 2n-1 through -1) that leaves two values: one for 0, and one extra Positive integers just like unsigned – zero in most significant (MS) bit 00101 = 5 Negative integers sign-magnitude – set MS bit to show negative, other bits are the same as unsigned 10101 = -5 one’s complement – flip every bit to represent negative 11010 = -5 in either case, MS bit indicates sign: 0=positive, 1=negative

Two’s Complement + 11011 (-5) + (-9) 00000 (0) 00000 (0) Problems with sign-magnitude and 1’s complement two representations of zero (+0 and –0) arithmetic circuits are complex How to add two sign-magnitude numbers? e.g., try 2 + (-3) How to add to one’s complement numbers? e.g., try 4 + (-3) Two’s complement representation developed to make circuits easy for arithmetic. for each positive number (X), assign value to its negative (-X), such that X + (-X) = 0 with “normal” addition, ignoring carry out 00101 (5) 01001 (9) + 11011 (-5) + (-9) 00000 (0) 00000 (0)

Two’s Complement Representation If number is positive or zero, normal binary representation, zeroes in upper bit(s) If number is negative, start with positive number flip every bit (i.e., take the one’s complement) then add one 00101 (5) 01001 (9) 11010 (1’s comp) (1’s comp) + 1 + 1 11011 (-5) (-9)

Two’s Complement Shortcut To take the two’s complement of a number: copy bits from right to left until (and including) the first “1” flip remaining bits to the left 011010000 011010000 100101111 (1’s comp) + 1 100110000 100110000 (flip) (copy)

Two’s Complement Signed Integers MS bit is sign bit – it has weight –2n-1. Range of an n-bit number: -2n-1 through 2n-1 – 1. The most negative number (-2n-1) has no positive counterpart. -23 22 21 20 1 2 3 4 5 6 7 -23 22 21 20 1 -8 -7 -6 -5 -4 -3 -2 -1

Converting Binary (2’s C) to Decimal If leading bit is one, take two’s complement to get a positive number. Add powers of 2 that have “1” in the corresponding bit positions. If original number was negative, add a minus sign. n 2n 1 2 4 3 8 16 5 32 6 64 7 128 256 9 512 10 1024 X = 01101000two = 26+25+23 = 64+32+8 = 104ten Assuming 8-bit 2’s complement numbers.

More Examples X = 00100111two = 25+22+21+20 = 32+4+2+1 = 39ten = 25+22+21+20 = 32+4+2+1 = 39ten n 2n 1 2 4 3 8 16 5 32 6 64 7 128 256 9 512 10 1024 X = 11100110two -X = 00011010 = 24+23+21 = 16+8+2 = 26ten X = -26ten Assuming 8-bit 2’s complement numbers.

Converting Decimal to Binary (2’s C) First Method: Division Find magnitude of decimal number. (Always positive.) Divide by two – remainder is least significant bit. Keep dividing by two until answer is zero, writing remainders from right to left. Append a zero as the MS bit; if original number was negative, take two’s complement. X = 104ten 104/2 = 52 r0 bit 0 52/2 = 26 r0 bit 1 26/2 = 13 r0 bit 2 13/2 = 6 r1 bit 3 6/2 = 3 r0 bit 4 3/2 = 1 r1 bit 5 X = 01101000two 1/2 = 0 r1 bit 6

Converting Decimal to Binary (2’s C) 1 2 4 3 8 16 5 32 6 64 7 128 256 9 512 10 1024 Second Method: Subtract Powers of Two Find magnitude of decimal number. Subtract largest power of two less than or equal to number. Put a one in the corresponding bit position. Keep subtracting until result is zero. Append a zero as MS bit; if original was negative, take two’s complement. X = 104ten 104 - 64 = 40 bit 6 40 - 32 = 8 bit 5 8 - 8 = 0 bit 3 X = 01101000two

Operations: Arithmetic and Logical Recall: a data type includes representation and operations. We now have a good representation for signed integers, so let’s look at some arithmetic operations: Addition Subtraction Sign Extension We’ll also look at overflow conditions for addition. Multiplication, division, etc., can be built from these basic operations. Logical operations are also useful: AND OR NOT

Addition + 11110000 (-16) + (-9) 01011000 (98) (-19) As we’ve discussed, 2’s comp. addition is just binary addition. assume all integers have the same number of bits ignore carry out for now, assume that sum fits in n-bit 2’s comp. representation 01101000 (104) 11110110 (-10) + 11110000 (-16) + (-9) 01011000 (98) (-19) Assuming 8-bit 2’s complement numbers.

Subtraction - 00010000 (16) - (-9) + 11110000 (-16) + (9) Negate subtrahend (2nd no.) and add. assume all integers have the same number of bits ignore carry out for now, assume that difference fits in n-bit 2’s comp. representation 01101000 (104) 11110110 (-10) - 00010000 (16) - (-9) + 11110000 (-16) + (9) 01011000 (88) (-1) Assuming 8-bit 2’s complement numbers.

Sign Extension To add two numbers, we must represent them with the same number of bits. If we just pad with zeroes on the left: Instead, replicate the MS bit -- the sign bit: 4-bit 8-bit 0100 (4) 00000100 (still 4) 1100 (-4) 00001100 (12, not -4) 4-bit 8-bit 0100 (4) 00000100 (still 4) 1100 (-4) 11111100 (still -4)

Overflow + 01001 (9) + 10111 (-9) 10001 (-15) 01111 (+15) If operands are too big, then sum cannot be represented as an n-bit 2’s comp number. We have overflow if: signs of both operands are the same, and sign of sum is different. Another test -- easy for hardware: carry into MS bit does not equal carry out 01000 (8) 11000 (-8) + 01001 (9) + 10111 (-9) 10001 (-15) 01111 (+15)

Logical Operations Operations on logical TRUE or FALSE two states -- takes one bit to represent: TRUE=1, FALSE=0 View n-bit number as a collection of n logical values operation applied to each bit independently A B A AND B 1 A B A OR B 1 A NOT A 1

Examples of Logical Operations AND useful for clearing bits AND with zero = 0 AND with one = no change OR useful for setting bits OR with zero = no change OR with one = 1 NOT unary operation -- one argument flips every bit 11000101 AND 00001111 00000101 11000101 OR 00001111 11001111 NOT 11000101 00111010

Hexadecimal Notation It is often convenient to write binary (base-2) numbers as hexadecimal (base-16) numbers instead. fewer digits -- four bits per hex digit less error prone -- easy to corrupt long string of 1’s and 0’s Binary Hex Decimal 0000 0001 1 0010 2 0011 3 0100 4 0101 5 0110 6 0111 7 Binary Hex Decimal 1000 8 1001 9 1010 A 10 1011 B 11 1100 C 12 1101 D 13 1110 E 14 1111 F 15

Converting from Binary to Hexadecimal Every four bits is a hex digit. start grouping from right-hand side 011101010001111010011010111 3 A 8 F 4 D 7 This is not a new machine representation, just a convenient way to write the number.

Fractions: Fixed-Point How can we represent fractions? Use a “binary point” to separate positive from negative powers of two -- just like “decimal point.” 2’s comp addition and subtraction still work. if binary points are aligned 00101000.101 (40.625) + 11111110.110 (-1.25) 00100111.011 (39.375) 2-1 = 0.5 2-2 = 0.25 2-3 = 0.125 No new operations -- same as integer arithmetic.

Very Large and Very Small: Floating-Point Large values: 6.023 x 1023 -- requires 79 bits Small values: 6.626 x 10-34 -- requires >110 bits Use equivalent of “scientific notation”: F x 2E Need to represent F (fraction), E (exponent), and sign. IEEE 754 Floating-Point Standard (32-bits): 1b 8b 23b S Exponent Fraction Exponent = 255 used for special values: If Fraction is non-zero, NaN (not a number). If Fraction is zero and sign is 0, positive infinity. If Fraction is zero and sign is 1, negative infinity.

Floating Point Example Single-precision IEEE floating point number: 10111111010000000000000000000000 Sign is 1 – number is negative. Exponent field is 01111110 = 126 (decimal). Fraction is 0.100000000000… = 0.5 (decimal). Value = -1.5 x 2(126-127) = -1.5 x 2-1 = -0.75. sign exponent fraction

Floating-Point Operations Will regular 2’s complement arithmetic work for Floating Point numbers? (Hint: In decimal, how do we compute 3.07 x 1012 + 9.11 x 108?)

Text: ASCII Characters ASCII: Maps 128 characters to 7-bit code. both printable and non-printable (ESC, DEL, …) characters 00 nul 10 dle 20 sp 30 40 @ 50 P 60 ` 70 p 01 soh 11 dc1 21 ! 31 1 41 A 51 Q 61 a 71 q 02 stx 12 dc2 22 " 32 2 42 B 52 R 62 b 72 r 03 etx 13 dc3 23 # 33 3 43 C 53 S 63 c 73 s 04 eot 14 dc4 24 $ 34 4 44 D 54 T 64 d 74 t 05 enq 15 nak 25 % 35 5 45 E 55 U 65 e 75 u 06 ack 16 syn 26 & 36 6 46 F 56 V 66 f 76 v 07 bel 17 etb 27 ' 37 7 47 G 57 W 67 g 77 w 08 bs 18 can 28 ( 38 8 48 H 58 X 68 h 78 x 09 ht 19 em 29 ) 39 9 49 I 59 Y 69 i 79 y 0a nl 1a sub 2a * 3a : 4a J 5a Z 6a j 7a z 0b vt 1b esc 2b + 3b ; 4b K 5b [ 6b k 7b { 0c np 1c fs 2c , 3c < 4c L 5c \ 6c l 7c | 0d cr 1d gs 2d - 3d = 4d M 5d ] 6d m 7d } 0e so 1e rs 2e . 3e > 4e N 5e ^ 6e n 7e ~ 0f si 1f us 2f / 3f ? 4f O 5f _ 6f o 7f del

Interesting Properties of ASCII Code What is relationship between a decimal digit ('0', '1', …) and its ASCII code? What is the difference between an upper-case letter ('A', 'B', …) and its lower-case equivalent ('a', 'b', …)? Given two ASCII characters, how do we tell which comes first in alphabetical order? Are 128 characters enough? (http://www.unicode.org/) No new operations -- integer arithmetic and logic.

Other Data Types Text strings Image Sound sequence of characters, terminated with NULL (0) typically, no hardware support Image array of pixels monochrome: one bit (1/0 = black/white) color: red, green, blue (RGB) components (e.g., 8 bits each) other properties: transparency hardware support: typically none, in general-purpose processors MMX -- multiple 8-bit operations on 32-bit word Sound sequence of fixed-point numbers

LC-3 Data Types Some data types are supported directly by the instruction set architecture. For LC-3, there is only one hardware-supported data type: 16-bit 2’s complement signed integer Operations: ADD, AND, NOT Other data types are supported by interpreting 16-bit values as logical, text, fixed-point, etc., in the software that we write.

Chapter 3 Digital Logic Structures

Transistor: Building Block of Computers Microprocessors contain millions of transistors Intel Pentium 4 (2000): 48 million IBM PowerPC 750FX (2002): 38 million IBM/Apple PowerPC G5 (2003): 58 million Logically, each transistor acts as a switch Combined to implement logic functions AND, OR, NOT Combined to build higher-level structures Adder, multiplexer, decoder, register, … Combined to build processor LC-3 Intel and Pentium are trademarks of Intel Corporation. IBM and PowerPC are trademarks of International Business Machines Corporation. Apple is a trademark of Apple Computer, Inc.

Simple Switch Circuit Switch open: Switch closed: No current through circuit Light is off Vout is +2.9V Switch closed: Short circuit across switch Current flows Light is on Vout is 0V Switch-based circuits can easily represent two states: on/off, open/closed, voltage/no voltage.

n-type MOS Transistor MOS = Metal Oxide Semiconductor n-type Gate = 1 two types: n-type and p-type n-type when Gate has positive voltage, short circuit between #1 and #2 (switch closed) when Gate has zero voltage, open circuit between #1 and #2 (switch open) Gate = 1 Gate = 0 Terminal #2 must be connected to GND (0V).

p-type MOS Transistor p-type is complementary to n-type Gate = 1 when Gate has positive voltage, open circuit between #1 and #2 (switch open) when Gate has zero voltage, short circuit between #1 and #2 (switch closed) Gate = 1 Gate = 0 Terminal #1 must be connected to +2.9V.

Logic Gates Use switch behavior of MOS transistors to implement logical functions: AND, OR, NOT. Digital symbols: recall that we assign a range of analog voltages to each digital (logic) symbol assignment of voltage ranges depends on electrical properties of transistors being used typical values for "1": +5V, +3.3V, +2.9V from now on we'll use +2.9V

CMOS Circuit Complementary MOS Uses both n-type and p-type MOS transistors p-type Attached to + voltage Pulls output voltage UP when input is zero n-type Attached to GND Pulls output voltage DOWN when input is one For all inputs, make sure that output is either connected to GND or to +, but not both!

Inverter (NOT Gate) Truth table In Out 0 V 2.9 V In Out 1

NOR Gate A B C 1 Note: Serial structure on top, parallel on bottom.

OR Gate A B C 1 Add inverter to NOR.

NAND Gate (AND-NOT) A B C 1 1 Note: Parallel structure on top, serial on bottom.

AND Gate A B C 1 Add inverter to NAND.

Basic Logic Gates

invert inputs and output. DeMorgan's Law Converting AND to OR (with some help from NOT) Consider the following gate: To convert AND to OR (or vice versa), invert inputs and output. A B 1 If there's time, perhaps discuss how all gates can be implemented with NAND (or NOR). Therefore, you can implement any truth table using only NAND (or NOR) gates. Same as A+B!

More than 2 Inputs? AND/OR can take any number of inputs. AND = 1 if all inputs are 1. OR = 1 if any input is 1. Similar for NAND/NOR. Can implement with multiple two-input gates, or with single CMOS circuit. NAND and NOR are not associative. Jim Conrad’s example: NAND(NAND(0,0), 1) = NAND(1, 1) = 0 NAND(0, NAND(0,1)) = NAND(0, 0) = 1

Summary MOS transistors are used as switches to implement logic functions. n-type: connect to GND, turn on (with 1) to pull down to 0 p-type: connect to +2.9V, turn on (with 0) to pull up to 1 Basic gates: NOT, NOR, NAND Logic functions are usually expressed with AND, OR, and NOT DeMorgan's Law Convert AND to OR (and vice versa) by inverting inputs and output

Building Functions from Logic Gates Combinational Logic Circuit output depends only on the current inputs stateless Sequential Logic Circuit output depends on the sequence of inputs (past and present) stores information (state) from past inputs We'll first look at some useful combinational circuits, then show how to use sequential circuits to store information.

Decoder 2-bit decoder n inputs, 2n outputs exactly one output is 1 for each possible input pattern 2-bit decoder Uses of decoder: convert memory/register address to a control line that selects that location convert an opcode to one of n control lines

Multiplexer (MUX) 4-to-1 MUX n-bit selector and 2n inputs, one output output equals one of the inputs, depending on selector Another view: decode S, and AND each output with one of the MUX inputs. Also explain multi-bit inputs. Uses of multiplexer: select which input to use for function select which computed value to pass to next stage (or to place on bus) 4-to-1 MUX

Full Adder Add two bits and carry-in, produce one-bit sum and carry-out. A B Cin S Cout 1 A half-adder is one that doesn't take a carry-in. Sum is one when 1 or 3 inputs are one. Carry-out is one when 2 or 3 inputs are one.

Four-bit Adder This is called a "ripple-carry" adder. The sum becomes valid as the carry ripples its way from the low bit to the high bit. How many gate delays until the output is settled?

Logical Completeness Can implement ANY truth table with AND, OR, NOT. 1 1. AND combinations that yield a "1" in the truth table. 2. OR the results of the AND gates. Note the use of the bubbles (NOT) in the input.

Combinational vs. Sequential Combinational Circuit always gives the same output for a given set of inputs ex: adder always generates sum and carry, regardless of previous inputs Sequential Circuit stores information output depends on stored information (state) plus input so a given input might produce different outputs, depending on the stored information example: ticket counter advances when you push the button output depends on previous state useful for building “memory” elements and “state machines”

R-S Latch: Simple Storage Element R is used to “reset” or “clear” the element – set it to zero. S is used to “set” the element – set it to one. If both R and S are one, out could be either zero or one. “quiescent” state -- holds its previous value note: if a is 1, b is 0, and vice versa 1 1 1 1 1 1 1 1 1

Then set R=1 to “store” value in quiescent state. Clearing the R-S latch Suppose we start with output = 1, then change R to zero. 1 1 1 1 Output changes to zero. 1 1 1 1 Setting R to zero forces b (and B) to 1, which forces a (and A) to zero. This is a stable state, because R=0 and A=0 means b=1. Bring R back to one then keeps the output at zero. What is the result if we start with a=0? 1 Then set R=1 to “store” value in quiescent state.

Then set S=1 to “store” value in quiescent state. Setting the R-S Latch Suppose we start with output = 0, then change S to zero. 1 1 1 Output changes to one. 1 1 Setting S to zero forces a (and A) to 1, which forces b (and B) to zero. This is a stable state, because S=0 and B=0 means a=1. Bring S back to one then keeps the output at one. What is the result if we start with a=1? 1 1 Then set S=1 to “store” value in quiescent state.

R-S Latch Summary R = S = 1 S = 0, R=1 R = 0, S = 1 R = S = 0 hold current value in latch S = 0, R=1 set value to 1 R = 0, S = 1 set value to 0 R = S = 0 both outputs equal one final state determined by electrical properties of gates Don’t do it!

Gated D-Latch Two inputs: D (data) and WE (write enable) when WE = 1, latch is set to value of D S = NOT(D), R = D when WE = 0, latch holds previous value S = R = 1 The D-latch is used to store a single data bit. The latch is set to the value of D whenever WE=1; when WE=0, the current value is stored, no matter what D becomes. Using D and not(D) to control S and R makes it easier to ensure that S and R are never zero at the same time. WE allows us to control when a new value is written to the latch.

Register A register stores a multi-bit value. We use a collection of D-latches, all controlled by a common WE. When WE=1, n-bit value D is written to register.

Representing Multi-bit Values Number bits from right (0) to left (n-1) just a convention -- could be left to right, but must be consistent Use brackets to denote range: D[l:r] denotes bit l to bit r, from left to right May also see A<14:9>, especially in hardware block diagrams. 15 A = 0101001101010101 A[14:9] = 101001 A[2:0] = 101

Memory Now that we know how to store bits, we can build a memory – a logical k × m array of stored bits. • Address Space: number of locations (usually a power of 2) k = 2n locations Addressability: number of bits per location (e.g., byte-addressable) m bits

22 x 3 Memory word WE word select input bits address write enable Decoder asserts one of the word select lines, based on address. Word select activates one of the output AND gates, which drives the selected data to the output OR gate. (For a read, this is basically a MUX -- decoder ANDed with signals, results ORed together.) When writing, the only WE bits for the proper word are asserted (based on decoder again). address decoder output bits

Also, non-volatile memories: ROM, PROM, flash, … More Memory Details This is a not the way actual memory is implemented. fewer transistors, much more dense, relies on electrical properties But the logical structure is very similar. address decoder word select line word write enable Two basic kinds of RAM (Random Access Memory) Static RAM (SRAM) fast, maintains data as long as power applied Dynamic RAM (DRAM) slower but denser, bit storage decays – must be periodically refreshed Also, non-volatile memories: ROM, PROM, flash, …

State Machine Another type of sequential circuit Inputs Outputs Combines combinational logic with storage “Remembers” state, and changes output (and state) based on inputs and current state State Machine Inputs Outputs Combinational Logic Circuit Storage Elements

Combinational vs. Sequential Two types of “combination” locks 30 15 5 10 20 25 4 1 8 Combinational Success depends only on the values, not the order in which they are set. Sequential Success depends on the sequence of values (e.g, R-13, L-22, R-3).

State The state of a system is a snapshot of all the relevant elements of the system at the moment the snapshot is taken. Examples: The state of a basketball game can be represented by the scoreboard. Number of points, time remaining, possession, etc. The state of a tic-tac-toe game can be represented by the placement of X’s and O’s on the board.

State of Sequential Lock Our lock example has four different states, labelled A-D: A: The lock is not open, and no relevant operations have been performed. B: The lock is not open, and the user has completed the R-13 operation. C: The lock is not open, and the user has completed R-13, followed by L-22. D: The lock is open.

State Diagram Shows states and actions that cause a transition between states.

Finite State Machine A description of a system with the following components: A finite number of states A finite number of external inputs A finite number of external outputs An explicit specification of all state transitions An explicit specification of what determines each external output value Often described by a state diagram. Inputs trigger state transitions. Outputs are associated with each state (or with each transition).

The Clock Frequently, a clock circuit triggers transition from one state to the next. At the beginning of each clock cycle, state machine makes a transition, based on the current state and the external inputs. Not always required. In lock example, the input itself triggers a transition. “1” “0” One Cycle time

Implementing a Finite State Machine Combinational logic Determine outputs and next state. Storage elements Maintain state representation. State Machine Inputs Outputs Combinational Logic Circuit Storage Elements Clock

Storage: Master-Slave Flipflop A pair of gated D-latches, to isolate next state from current state. During 1st phase (clock=1), previously-computed state becomes current state and is sent to the logic circuit. During 2nd phase (clock=0), next state, computed by logic circuit, is stored in Latch A.

Storage Each master-slave flipflop stores one state bit. The number of storage elements (flipflops) needed is determined by the number of states (and the representation of each state). Examples: Sequential lock Four states – two bits Basketball scoreboard 7 bits for each score, 5 bits for minutes, 6 bits for seconds, 1 bit for possession arrow, 1 bit for half, …

DANGER MOVE RIGHT Complete Example A blinking traffic sign No lights on 1 & 2 on 1, 2, 3, & 4 on 1, 2, 3, 4, & 5 on (repeat as long as switch is turned on) 3 4 1 5 2 DANGER MOVE RIGHT

Traffic Sign State Diagram Switch on Switch off State bit S1 State bit S0 Outputs Transition on each clock cycle.

Traffic Sign Truth Tables Outputs (depend only on state: S1S0) Next State: S1’S0’ (depend on state and input) Switch Lights 1 and 2 Lights 3 and 4 In S1 S0 S1’ S0’ X 1 Light 5 S1 S0 Z Y X 1 Whenever In=0, next state is 00.

Master-slave flipflop Traffic Sign Logic Master-slave flipflop

From Logic to Data Path The data path of a computer is all the logic used to process information. See the data path of the LC-3 on next slide. Combinational Logic Decoders -- convert instructions into control signals Multiplexers -- select inputs and outputs ALU (Arithmetic and Logic Unit) -- operations on data Sequential Logic State machine -- coordinate control signals and data movement Registers and latches -- storage elements

LC-3 Data Path Combinational Logic Storage State Machine

Chapter 4 The Von Neumann Model

The Stored Program Computer 1943: ENIAC Presper Eckert and John Mauchly -- first general electronic computer. (or was it John V. Atanasoff in 1939?) Hard-wired program -- settings of dials and switches. 1944: Beginnings of EDVAC among other improvements, includes program stored in memory 1945: John von Neumann wrote a report on the stored program concept, known as the First Draft of a Report on EDVAC The basic structure proposed in the draft became known as the “von Neumann machine” (or model). a memory, containing instructions and data a processing unit, for performing arithmetic and logical operations a control unit, for interpreting instructions For more history, see http://www.maxmon.com/history.htm

Von Neumann Model

Memory 2k x m array of stored bits Address Contents Basic Operations: unique (k-bit) identifier of location Contents m-bit value stored in location Basic Operations: LOAD read a value from a memory location STORE write a value to a memory location 0000 0001 0010 0011 0100 0101 0110 1101 1110 1111 • 00101101 10100010

Interface to Memory How does processing unit get data to/from memory? MAR: Memory Address Register MDR: Memory Data Register To LOAD a location (A): Write the address (A) into the MAR. Send a “read” signal to the memory. Read the data from MDR. To STORE a value (X) to a location (A): Write the data (X) to the MDR. Send a “write” signal to the memory.

Processing Unit Functional Units Registers Word Size ALU = Arithmetic and Logic Unit could have many functional units. some of them special-purpose (multiply, square root, …) LC-3 performs ADD, AND, NOT Registers Small, temporary storage Operands and results of functional units LC-3 has eight registers (R0, …, R7), each 16 bits wide Word Size number of bits normally processed by ALU in one instruction also width of registers LC-3 is 16 bits

Input and Output Devices for getting data into and out of computer memory Each device has its own interface, usually a set of registers like the memory’s MAR and MDR LC-3 supports keyboard (input) and monitor (output) keyboard: data register (KBDR) and status register (KBSR) monitor: data register (DDR) and status register (DSR) Some devices provide both input and output disk, network Program that controls access to a device is usually called a driver.

Control Unit Orchestrates execution of the program Instruction Register (IR) contains the current instruction. Program Counter (PC) contains the address of the next instruction to be executed. Control unit: reads an instruction from memory the instruction’s address is in the PC interprets the instruction, generating signals that tell the other components what to do an instruction may take many machine cycles to complete

Instruction Processing Fetch instruction from memory Decode instruction Evaluate address Fetch operands from memory Execute operation Store result

Instruction The instruction is the fundamental unit of work. Specifies two things: opcode: operation to be performed operands: data/locations to be used for operation An instruction is encoded as a sequence of bits. (Just like data!) Often, but not always, instructions have a fixed length, such as 16 or 32 bits. Control unit interprets instruction: generates sequence of control signals to carry out operation. Operation is either executed completely, or not at all. A computer’s instructions and their formats is known as its Instruction Set Architecture (ISA).

Example: LC-3 ADD Instruction LC-3 has 16-bit instructions. Each instruction has a four-bit opcode, bits [15:12]. LC-3 has eight registers (R0-R7) for temporary storage. Sources and destination of ADD are registers. “Add the contents of R2 to the contents of R6, and store the result in R6.”

Example: LC-3 LDR Instruction Load instruction -- reads data from memory Base + offset mode: add offset to base register -- result is memory address load from memory address into destination register “Add the value 6 to the contents of R3 to form a memory address. Load the contents of that memory location to R2.”

Instruction Processing: FETCH Load next instruction (at address stored in PC) from memory into Instruction Register (IR). Copy contents of PC into MAR. Send “read” signal to memory. Copy contents of MDR into IR. Then increment PC, so that it points to the next instruction in sequence. PC becomes PC+1. F D EA OP EX S

Instruction Processing: DECODE First identify the opcode. In LC-3, this is always the first four bits of instruction. A 4-to-16 decoder asserts a control line corresponding to the desired opcode. Depending on opcode, identify other operands from the remaining bits. Example: for LDR, last six bits is offset for ADD, last three bits is source operand #2 F D EA OP EX S

Instruction Processing: EVALUATE ADDRESS For instructions that require memory access, compute address used for access. Examples: add offset to base register (as in LDR) add offset to PC add offset to zero F D EA OP EX S

Instruction Processing: FETCH OPERANDS Obtain source operands needed to perform operation. Examples: load data from memory (LDR) read data from register file (ADD) F D EA OP EX S

Instruction Processing: EXECUTE Perform the operation, using the source operands. Examples: send operands to ALU and assert ADD signal do nothing (e.g., for loads and stores) F D EA OP EX S

Instruction Processing: STORE RESULT Write results to destination. (register or memory) Examples: result of ADD is placed in destination register result of memory load is placed in destination register for store instruction, data is stored to memory write address to MAR, data to MDR assert WRITE signal to memory F D EA OP EX S

Changing the Sequence of Instructions In the FETCH phase, we increment the Program Counter by 1. What if we don’t want to always execute the instruction that follows this one? examples: loop, if-then, function call Need special instructions that change the contents of the PC. These are called control instructions. jumps are unconditional -- they always change the PC branches are conditional -- they change the PC only if some condition is true (e.g., the result of an ADD is zero)

Example: LC-3 JMP Instruction Set the PC to the value contained in a register. This becomes the address of the next instruction to fetch. “Load the contents of R3 into the PC.”

Instruction Processing Summary Instructions look just like data -- it’s all interpretation. Three basic kinds of instructions: computational instructions (ADD, AND, …) data movement instructions (LD, ST, …) control instructions (JMP, BRnz, …) Six basic phases of instruction processing: F  D  EA  OP  EX  S not all phases are needed by every instruction phases may take variable number of machine cycles

Control Unit State Diagram The control unit is a state machine. Here is part of a simplified state diagram for the LC-3: A more complete state diagram is in Appendix C. It will be more understandable after Chapter 5.

Stopping the Clock Control unit will repeat instruction processing sequence as long as clock is running. If not processing instructions from your application, then it is processing instructions from the Operating System (OS). The OS is a special program that manages processor and other resources. To stop the computer: AND the clock generator signal with ZERO When control unit stops seeing the CLOCK signal, it stops processing.

Chapter 5 The LC-3

Instruction Set Architecture ISA = All of the programmer-visible components and operations of the computer memory organization address space -- how may locations can be addressed? addressibility -- how many bits per location? register set how many? what size? how are they used? instruction set opcodes data types addressing modes ISA provides all information needed for someone that wants to write a program in machine language (or translate from a high-level language to machine language).

LC-3 Overview: Memory and Registers address space: 216 locations (16-bit addresses) addressability: 16 bits Registers temporary storage, accessed in a single machine cycle accessing memory generally takes longer than a single cycle eight general-purpose registers: R0 - R7 each 16 bits wide how many bits to uniquely identify a register? other registers not directly addressable, but used by (and affected by) instructions PC (program counter), condition codes

LC-3 Overview: Instruction Set Opcodes 15 opcodes Operate instructions: ADD, AND, NOT Data movement instructions: LD, LDI, LDR, LEA, ST, STR, STI Control instructions: BR, JSR/JSRR, JMP, RTI, TRAP some opcodes set/clear condition codes, based on result: N = negative, Z = zero, P = positive (> 0) Data Types 16-bit 2’s complement integer Addressing Modes How is the location of an operand specified? non-memory addresses: immediate, register memory addresses: PC-relative, indirect, base+offset

Operate Instructions Only three operations: ADD, AND, NOT Source and destination operands are registers These instructions do not reference memory. ADD and AND can use “immediate” mode, where one operand is hard-wired into the instruction. Will show dataflow diagram with each instruction. illustrates when and where data moves to accomplish the desired operation

Note: Src and Dst could be the same register. NOT (Register) Note: Src and Dst could be the same register.

this zero means “register mode” ADD/AND (Register)

ADD/AND (Immediate) Note: Immediate field is sign-extended. this one means “immediate mode” ADD/AND (Immediate) Note: Immediate field is sign-extended. .

Using Operate Instructions With only ADD, AND, NOT… How do we subtract? How do we OR? How do we copy from one register to another? How do we initialize a register to zero? Subtract: R3 = R1 - R2 Take 2’s complement of R2, then add to R1. (1) R2 = NOT(R2) (2) R2 = R2 + 1 (3) R3 = R1 + R2 OR: R3 = R1 OR R2 Use DeMorgan’s Law -- invert R1 and R2, AND, then invert result. (1) R1 = NOT(R1) (2) R2 = NOT(R2) (3) R3 = R1 AND R2 (4) R3 = NOT(R3) Register-to-register copy: R3 = R2 R3 = R2 + 0 (Add-immediate) Initialize to zero: R1 = 0 R1 = R1 AND 0 (And-immediate)

Data Movement Instructions Load -- read data from memory to register LD: PC-relative mode LDR: base+offset mode LDI: indirect mode Store -- write data from register to memory ST: PC-relative mode STR: base+offset mode STI: indirect mode Load effective address -- compute address, save in register LEA: immediate mode does not access memory

PC-Relative Addressing Mode Want to specify address directly in the instruction But an address is 16 bits, and so is an instruction! After subtracting 4 bits for opcode and 3 bits for register, we have 9 bits available for address. Solution: Use the 9 bits as a signed offset from the current PC. 9 bits: Can form any address X, such that: Remember that PC is incremented as part of the FETCH phase; This is done before the EVALUATE ADDRESS stage.

LD (PC-Relative)

ST (PC-Relative)

Indirect Addressing Mode With PC-relative mode, can only address data within 256 words of the instruction. What about the rest of memory? Solution #1: Read address from memory location, then load/store to that address. First address is generated from PC and IR (just like PC-relative addressing), then content of that address is used as target for load/store.

LDI (Indirect)

STI (Indirect)

Base + Offset Addressing Mode With PC-relative mode, can only address data within 256 words of the instruction. What about the rest of memory? Solution #2: Use a register to generate a full 16-bit address. 4 bits for opcode, 3 for src/dest register, 3 bits for base register -- remaining 6 bits are used as a signed offset. Offset is sign-extended before adding to base register.

LDR (Base+Offset)

STR (Base+Offset)

Load Effective Address Computes address like PC-relative (PC plus signed offset) and stores the result into a register. Note: The address is stored in the register, not the contents of the memory location.

LEA (Immediate)

Example Address Instruction Comments x30F6 1 1 1 0 0 0 1 1 1 1 1 1 1 1 0 1 R1  PC – 3 = x30F4 x30F7 0 0 0 1 0 1 0 0 0 1 1 0 1 1 1 0 R2  R1 + 14 = x3102 x30F8 0 0 1 1 0 1 0 1 1 1 1 1 1 0 1 1 M[PC - 5]  R2 M[x30F4]  x3102 x30F9 0 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 R2  0 x30FA 0 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 R2  R2 + 5 = 5 x30FB 0 1 1 1 0 1 0 0 0 1 0 0 1 1 1 0 M[R1+14]  R2 M[x3102]  5 x30FC 1 0 1 0 0 1 1 1 1 1 1 1 0 1 1 1 R3  M[M[x30F4]] R3  M[x3102] R3  5 opcode

Control Instructions Used to alter the sequence of instructions (by changing the Program Counter) Conditional Branch branch is taken if a specified condition is true signed offset is added to PC to yield new PC else, the branch is not taken PC is not changed, points to the next sequential instruction Unconditional Branch (or Jump) always changes the PC TRAP changes PC to the address of an OS “service routine” routine will return control to the next instruction (after TRAP)

Condition Codes LC-3 has three condition code registers: N -- negative Z -- zero P -- positive (greater than zero) Set by any instruction that writes a value to a register (ADD, AND, NOT, LD, LDR, LDI, LEA) Exactly one will be set at all times Based on the last instruction that altered a register

Branch Instruction Branch specifies one or more condition codes. If the set bit is specified, the branch is taken. PC-relative addressing: target address is made by adding signed offset (IR[8:0]) to current PC. Note: PC has already been incremented by FETCH stage. Note: Target must be within 256 words of BR instruction. If the branch is not taken, the next sequential instruction is executed.

What happens if bits [11:9] are all zero? All one? BR (PC-Relative) If all zero, no CC is tested, so branch is never taken. (See Appendix B.) If all one, then all are tested. Since at least one of the CC bits is set to one after each operate/load instruction, then branch is always taken. (Assumes some instruction has set CC before branch instruction, otherwise undefined.) What happens if bits [11:9] are all zero? All one?

Using Branch Instructions Compute sum of 12 integers. Numbers start at location x3100. Program starts at location x3000. R1  x3100 R3  0 R2  12 R2=0? R4  M[R1] R3  R3+R4 R1  R1+1 R2  R2-1 NO YES

Sample Program Address Instruction Comments x3000 1 1 1 0 0 0 1 0 1 1 1 1 1 1 1 1 R1  x3100 (PC+0xFF) x3001 0 1 0 1 0 1 1 0 1 1 1 0 0 0 0 0 R3  0 x3002 0 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 R2  0 x3003 0 0 0 1 0 1 0 0 1 0 1 0 1 1 0 0 R2  12 x3004 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 If Z, goto x300A (PC+5) x3005 0 1 1 0 1 0 0 0 0 1 0 0 0 0 0 0 Load next value to R4 x3006 0 0 0 1 0 1 1 0 1 1 0 0 0 1 0 0 Add to R3 x3007 0 0 0 1 0 0 1 0 0 1 1 0 0 0 0 1 Increment R1 (pointer) X3008 0 0 0 1 0 1 0 0 1 0 1 1 1 1 1 1 Decrement R2 (counter) x3009 0 0 0 0 1 1 1 1 1 1 1 1 1 0 1 0 Goto x3004 (PC-6)

JMP (Register) Jump is an unconditional branch -- always taken. Target address is the contents of a register. Allows any target address.

TRAP Calls a service routine, identified by 8-bit “trap vector.” When routine is done, PC is set to the instruction following TRAP. (We’ll talk about how this works later.) vector routine x23 input a character from the keyboard x21 output a character to the monitor x25 halt the program

Another Example Count the occurrences of a character in a file Program begins at location x3000 Read character from keyboard Load each character from a “file” File is a sequence of memory locations Starting address of file is stored in the memory location immediately after the program If file character equals input character, increment counter End of file is indicated by a special ASCII value: EOT (x04) At the end, print the number of characters and halt (assume there will be less than 10 occurrences of the character) A special character used to indicate the end of a sequence is often called a sentinel. Useful when you don’t know ahead of time how many times to execute a loop.

Flow Chart

Program (1 of 2) Address Instruction Comments x3000 0 1 0 1 0 1 0 0 1 0 1 0 0 0 0 0 R2  0 (counter) x3001 0 0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 R3  M[x3012] (ptr) x3002 1 1 1 1 0 0 0 0 0 0 1 0 0 0 1 1 Input to R0 (TRAP x23) x3003 0 1 1 0 0 0 1 0 1 1 0 0 0 0 0 0 R1  M[R3] x3004 0 0 0 1 1 0 0 0 0 1 1 1 1 1 0 0 R4  R1 – 4 (EOT) x3005 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 If Z, goto x300E x3006 1 0 0 1 0 0 1 0 0 1 1 1 1 1 1 1 R1  NOT R1 x3007 0 0 0 1 0 0 1 0 0 1 1 0 0 0 0 1 R1  R1 + 1 X3008 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 R1  R1 + R0 x3009 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 If N or P, goto x300B

Starting Address of File Program (2 of 2) Address Instruction Comments x300A 0 0 0 1 0 1 0 0 1 0 1 0 0 0 0 1 R2  R2 + 1 x300B 0 0 0 1 0 1 1 0 1 1 1 0 0 0 0 1 R3  R3 + 1 x300C 0 1 1 0 0 0 1 0 1 1 0 0 0 0 0 0 R1  M[R3] x300D 0 0 0 0 1 1 1 1 1 1 1 1 0 1 1 0 Goto x3004 x300E 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 R0  M[x3013] x300F 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 R0  R0 + R2 x3010 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 1 Print R0 (TRAP x21) x3011 1 1 1 1 0 0 0 0 0 0 1 0 0 1 0 1 HALT (TRAP x25) X3012 Starting Address of File x3013 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 ASCII x30 (‘0’)

LC-3 Data Path Revisited Filled arrow = info to be processed. Unfilled arrow = control signal.

Data Path Components Global bus Memory special set of wires that carry a 16-bit signal to many components inputs to the bus are “tri-state devices,” that only place a signal on the bus when they are enabled only one (16-bit) signal should be enabled at any time control unit decides which signal “drives” the bus any number of components can read the bus register only captures bus data if it is write-enabled by the control unit Memory Control and data registers for memory and I/O devices memory: MAR, MDR (also control signal for read/write)

Data Path Components ALU Register File Accepts inputs from register file and from sign-extended bits from IR (immediate field). Output goes to bus. used by condition code logic, register file, memory Register File Two read addresses (SR1, SR2), one write address (DR) Input from bus result of ALU operation or memory read Two 16-bit outputs used by ALU, PC, memory address data for store instructions passes through ALU

Data Path Components PC and PCMUX MAR and MARMUX Three inputs to PC, controlled by PCMUX PC+1 – FETCH stage Address adder – BR, JMP bus – TRAP (discussed later) MAR and MARMUX Two inputs to MAR, controlled by MARMUX Address adder – LD/ST, LDR/STR Zero-extended IR[7:0] -- TRAP (discussed later)

Data Path Components Condition Code Logic Looks at value on bus and generates N, Z, P signals Registers set only when control unit enables them (LD.CC) only certain instructions set the codes (ADD, AND, NOT, LD, LDI, LDR, LEA) Control Unit – Finite State Machine On each machine cycle, changes control signals for next phase of instruction processing who drives the bus? (GatePC, GateALU, …) which registers are write enabled? (LD.IR, LD.REG, …) which operation should ALU perform? (ALUK) … Logic includes decoder for opcode, etc.

Chapter 6 Programming

Solving Problems using a Computer Methodologies for creating computer programs that perform a desired function. Problem Solving How do we figure out what to tell the computer to do? Convert problem statement into algorithm, using stepwise refinement. Convert algorithm into LC-3 machine instructions. Debugging How do we figure out why it didn’t work? Examining registers and memory, setting breakpoints, etc. Time spent on the first can reduce time spent on the second!

Stepwise Refinement Also known as systematic decomposition. Start with problem statement: “We wish to count the number of occurrences of a character in a file. The character in question is to be input from the keyboard; the result is to be displayed on the monitor.” Decompose task into a few simpler subtasks. Decompose each subtask into smaller subtasks, and these into even smaller subtasks, etc.... until you get to the machine instruction level.

Problem Statement Because problem statements are written in English, they are sometimes ambiguous and/or incomplete. Where is “file” located? How big is it, or how do I know when I’ve reached the end? How should final count be printed? A decimal number? If the character is a letter, should I count both upper-case and lower-case occurrences? How do you resolve these issues? Ask the person who wants the problem solved, or Make a decision and document it.

Three Basic Constructs There are three basic ways to decompose a task:

Sequential Do Subtask 1 to completion, then do Subtask 2 to completion, etc.

Conditional If condition is true, do Subtask 1; else, do Subtask 2.

Iterative Do Subtask over and over, as long as the test condition is true.

Problem Solving Skills Learn to convert problem statement into step-by-step description of subtasks. Like a puzzle, or a “word problem” from grammar school math. What is the starting state of the system? What is the desired ending state? How do we move from one state to another? Recognize English words that correlate to three basic constructs: “do A then do B”  sequential “if G, then do H”  conditional “for each X, do Y”  iterative “do Z until W”  iterative

LC-3 Control Instructions How do we use LC-3 instructions to encode the three basic constructs? Sequential Instructions naturally flow from one to the next, so no special instruction needed to go from one sequential subtask to the next. Conditional and Iterative Create code that converts condition into N, Z, or P. Example: Condition: “Is R0 = R1?” Code: Subtract R1 from R0; if equal, Z bit will be set. Then use BR instruction to transfer control to the proper subtask.

Unconditional branch to Next Subtask Code for Conditional PC offset to address C Exact bits depend on condition being tested Unconditional branch to Next Subtask PC offset to address D Assuming all addresses are close enough that PC-relative branch can be used.

Code for Iteration Assuming all addresses are on the same page. PC offset to address C Exact bits depend on condition being tested Unconditional branch to retest condition PC offset to address A Assuming all addresses are on the same page.

Example: Counting Characters Initial refinement: Big task into three sequential subtasks.

Refining B into iterative construct.

Refining B1 into sequential subtasks.

Refining B2 and B3 Conditional (B2) and sequential (B3). Use of LC-2 registers and instructions.

The Last Step: LC-3 Instructions Use comments to separate into modules and to document your code. ; Look at each char in file. 0001100001111100 ; is R1 = EOT? 0000010xxxxxxxxx ; if so, exit loop ; Check for match with R0. 1001001001111111 ; R1 = -char 0001001001100001 0001001000000001 ; R1 = R0 – char 0000101xxxxxxxxx ; no match, skip incr 0001010010100001 ; R2 = R2 + 1 ; Incr file ptr and get next char 0001011011100001 ; R3 = R3 + 1 0110001011000000 ; R1 = M[R3] Don’t know PCoffset bits until all the code is done

Debugging You’ve written your program and it doesn’t work. Now what? What do you do when you’re lost in a city? Drive around randomly and hope you find it? Return to a known point and look at a map? In debugging, the equivalent to looking at a map is tracing your program. Examine the sequence of instructions being executed. Keep track of results being produced. Compare result from each instruction to the expected result.

Debugging Operations Any debugging environment should provide means to: Display values in memory and registers. Deposit values in memory and registers. Execute instruction sequence in a program. Stop execution when desired. Different programming levels offer different tools. High-level languages (C, Java, ...) usually have source-code debugging tools. For debugging at the machine instruction level: simulators operating system “monitor” tools in-circuit emulators (ICE) plug-in hardware replacements that give instruction-level control

LC-3 Simulator stop execution, set breakpoints execute instruction sequences set/display registers and memory

Types of Errors Syntax Errors Logic Errors Data Errors You made a typing error that resulted in an illegal operation. Not usually an issue with machine language, because almost any bit pattern corresponds to some legal instruction. In high-level languages, these are often caught during the translation from language to machine code. Logic Errors Your program is legal, but wrong, so the results don’t match the problem statement. Trace the program to see what’s really happening and determine how to get the proper behavior. Data Errors Input data is different than what you expected. Test the program with a wide variety of inputs.

Tracing the Program Execute the program one piece at a time, examining register and memory to see results at each step. Single-Stepping Execute one instruction at a time. Tedious, but useful to help you verify each step of your program. Breakpoints Tell the simulator to stop executing when it reaches a specific instruction. Check overall results at specific points in the program. Lets you quickly execute sequences to get a high-level overview of the execution behavior. Quickly execute sequences that your believe are correct. Watchpoints Tell the simulator to stop when a register or memory location changes or when it equals a specific value. Useful when you don’t know where or when a value is changed.

Example 1: Multiply This program is supposed to multiply the two unsigned integers in R4 and R5. clear R2 add R4 to R2 decrement R5 R5 = 0? HALT No Yes x3200 0101010010100000 x3201 0001010010000100 x3202 0001101101111111 x3203 0000011111111101 x3204 1111000000100101 Set R4 = 10, R5 =3. Run program. Result: R2 = 40, not 30.

Debugging the Multiply Program Single-stepping PC R2 R4 R5 x3200 -- 10 3 x3201 x3202 x3203 2 20 1 30 40 -1 x3204 Breakpoint at branch (x3203) PC and registers at the beginning of each instruction PC R2 R4 R5 x3203 10 2 20 1 30 40 -1 Should stop looping here! Executing loop one time too many. Branch at x3203 should be based on Z bit only, not Z and P.

Example 2: Summing an Array of Numbers This program is supposed to sum the numbers stored in 10 locations beginning with x3100, leaving the result in R1. R1 = 0 R4 = 10 R2 = x3100 x3000 0101001001100000 x3001 0101100100100000 x3002 0001100100101010 x3003 0010010011111100 x3004 0110011010000000 x3005 0001010010100001 x3006 0001001001000011 x3007 0001100100111111 x3008 0000001111111011 x3009 1111000000100101 R1 = R1 + M[R2] R2 = R2 + 1 R4 = R4 - 1 R4 = 0? No Yes HALT

Debugging the Summing Program Running the the data below yields R1 = x0024, but the sum should be x8135. What happened? Address Contents x3100 x3107 x3101 x2819 x3102 x0110 x3103 x0310 x3104 x3105 x1110 x3106 x11B1 x0019 x3108 x0007 x3109 x0004 Start single-stepping program... PC R1 R2 R4 x3000 -- x3001 x3002 x3003 10 x3004 x3107 Should be x3100! Loading contents of M[x3100], not address. Change opcode of x3003 from 0010 (LD) to 1110 (LEA).

Example 3: Looking for a 5 This program is supposed to set R0=1 if there’s a 5 in one ten memory locations, starting at x3100. Else, it should set R0 to 0. x3000 0101000000100000 x3001 0001000000100001 x3002 0101001001100000 x3003 0001001001111011 x3004 0101011011100000 x3005 0001011011101010 x3006 0010100000001001 x3007 0110010100000000 x3008 0001010010000001 x3009 0000010000000101 x300A 0001100100100001 x300B 0001011011111111 x300C 0110010100000000 x300D 0000001111111010 x300E 0101000000100000 x300F 1111000000100101 x3010 0011000100000000 R0 = 1, R1 = -5, R3 = 10 R4 = x3100, R2 = M[R4] R2 = 5? Yes No R3 = 0? R4 = R4 + 1 R3 = R3-1 R2 = M[R4] No Yes R0 = 0 HALT

Debugging the Fives Program Running the program with a 5 in location x3108 results in R0 = 0, not R0 = 1. What happened? Perhaps we didn’t look at all the data? Put a breakpoint at x300D to see how many times we branch back. Address Contents x3100 9 x3101 7 x3102 32 x3103 x3104 -8 x3105 19 x3106 6 x3107 13 x3108 5 x3109 61 PC R0 R2 R3 R4 x300D 1 7 9 x3101 32 8 x3102 x3103 Didn’t branch back, even though R3 > 0? Branch uses condition code set by loading R2 with M[R4], not by decrementing R3. Swap x300B and x300C, or remove x300C and branch back to x3007.

Example 4: Finding First 1 in a Word This program is supposed to return (in R1) the bit position of the first 1 in a word. The address of the word is in location x3009 (just past the end of the program). If there are no ones, R1 should be set to –1. R1 = 15 R2 = data x3000 0101001001100000 x3001 0001001001101111 x3002 1010010000000110 x3003 0000100000000100 x3004 0001001001111111 x3005 0001010010000010 x3006 0000100000000001 x3007 0000111111111100 x3008 1111000000100101 x3009 0011000100000000 R2[15] = 1? Yes No decrement R1 shift R2 left one bit R2[15] = 1? No Yes HALT

Debugging the First-One Program Program works most of the time, but if data is zero, it never seems to HALT. Breakpoint at backwards branch (x3007) PC R1 x3007 14 13 12 11 10 9 8 7 6 5 PC R1 x3007 4 3 2 1 -1 -2 -3 -4 -5 If no ones, then branch to HALT never occurs! This is called an “infinite loop.” Must change algorithm to either (a) check for special case (R2=0), or (b) exit loop if R1 < 0.

Debugging: Lessons Learned Trace program to see what’s going on. Breakpoints, single-stepping When tracing, make sure to notice what’s really happening, not what you think should happen. In summing program, it would be easy to not notice that address x3107 was loaded instead of x3100. Test your program using a variety of input data. In Examples 3 and 4, the program works for many data sets. Be sure to test extreme cases (all ones, no ones, ...).

Chapter 7 Assembly Language

Human-Readable Machine Language Computers like ones and zeros… Humans like symbols… Assembler is a program that turns symbols into machine instructions. ISA-specific: close correspondence between symbols and instruction set mnemonics for opcodes labels for memory locations additional operations for allocating storage and initializing data 0001110010000110 ADD R6,R2,R6 ; increment index reg.

An Assembly Language Program ; ; Program to multiply a number by the constant 6 .ORIG x3050 LD R1, SIX LD R2, NUMBER AND R3, R3, #0 ; Clear R3. It will ; contain the product. ; The inner loop AGAIN ADD R3, R3, R2 ADD R1, R1, #-1 ; R1 keeps track of BRp AGAIN ; the iteration. HALT NUMBER .BLKW 1 SIX .FILL x0006 .END

LC-3 Assembly Language Syntax Each line of a program is one of the following: an instruction an assember directive (or pseudo-op) a comment Whitespace (between symbols) and case are ignored. Comments (beginning with “;”) are also ignored. An instruction has the following format: LABEL OPCODE OPERANDS ; COMMENTS optional mandatory

Opcodes and Operands Opcodes Operands reserved symbols that correspond to LC-3 instructions listed in Appendix A ex: ADD, AND, LD, LDR, … Operands registers -- specified by Rn, where n is the register number numbers -- indicated by # (decimal) or x (hex) label -- symbolic name of memory location separated by comma number, order, and type correspond to instruction format ex: ADD R1,R1,R3 ADD R1,R1,#3 LD R6,NUMBER BRz LOOP

Labels and Comments Label Comment placed at the beginning of the line assigns a symbolic name to the address corresponding to line ex: LOOP ADD R1,R1,#-1 BRp LOOP Comment anything after a semicolon is a comment ignored by assembler used by humans to document/understand programs tips for useful comments: avoid restating the obvious, as “decrement R1” provide additional insight, as in “accumulate product in R6” use comments to separate pieces of program

Assembler Directives Pseudo-operations do not refer to operations executed by program used by assembler look like instruction, but “opcode” starts with dot Opcode Operand Meaning .ORIG address starting address of program .END end of program .BLKW n allocate n words of storage .FILL allocate one word, initialize with value n .STRINGZ n-character string allocate n+1 locations, initialize w/characters and null terminator

Trap Codes LC-3 assembler provides “pseudo-instructions” for each trap code, so you don’t have to remember them. Code Equivalent Description HALT TRAP x25 Halt execution and print message to console. IN TRAP x23 Print prompt on console, read (and echo) one character from keybd. Character stored in R0[7:0]. OUT TRAP x21 Write one character (in R0[7:0]) to console. GETC TRAP x20 Read one character from keyboard. Character stored in R0[7:0]. PUTS TRAP x22 Write null-terminated string to console. Address of string is in R0.

Style Guidelines Use the following style guidelines to improve the readability and understandability of your programs: Provide a program header, with author’s name, date, etc., and purpose of program. Start labels, opcode, operands, and comments in same column for each line. (Unless entire line is a comment.) Use comments to explain what each register does. Give explanatory comment for most instructions. Use meaningful symbolic names. Mixed upper and lower case for readability. ASCIItoBinary, InputRoutine, SaveR1 Provide comments between program sections. Each line must fit on the page -- no wraparound or truncations. Long statements split in aesthetically pleasing manner.

Sample Program Count the occurrences of a character in a file. Remember this?

Char Count in Assembly Language (1 of 3) ; ; Program to count occurrences of a character in a file. ; Character to be input from the keyboard. ; Result to be displayed on the monitor. ; Program only works if no more than 9 occurrences are found. ; Initialization .ORIG x3000 AND R2, R2, #0 ; R2 is counter, initially 0 LD R3, PTR ; R3 is pointer to characters GETC ; R0 gets character input LDR R1, R3, #0 ; R1 gets first character ; Test character for end of file TEST ADD R4, R1, #-4 ; Test for EOT (ASCII x04) BRz OUTPUT ; If done, prepare the output

Char Count in Assembly Language (2 of 3) ; ; Test character for match. If a match, increment count. NOT R1, R1 ADD R1, R1, R0 ; If match, R1 = xFFFF NOT R1, R1 ; If match, R1 = x0000 BRnp GETCHAR ; If no match, do not increment ADD R2, R2, #1 ; Get next character from file. GETCHAR ADD R3, R3, #1 ; Point to next character. LDR R1, R3, #0 ; R1 gets next char to test BRnzp TEST ; Output the count. OUTPUT LD R0, ASCII ; Load the ASCII template ADD R0, R0, R2 ; Covert binary count to ASCII OUT ; ASCII code in R0 is displayed. HALT ; Halt machine

Char Count in Assembly Language (3 of 3) ; ; Storage for pointer and ASCII template ASCII .FILL x0030 PTR .FILL x4000 .END

Assembly Process Convert assembly language file (.asm) into an executable file (.obj) for the LC-3 simulator. First Pass: scan program file find all labels and calculate the corresponding addresses; this is called the symbol table Second Pass: convert instructions to machine language, using information from symbol table

First Pass: Constructing the Symbol Table Find the .ORIG statement, which tells us the address of the first instruction. Initialize location counter (LC), which keeps track of the current instruction. For each non-empty line in the program: If line contains a label, add label and LC to symbol table. Increment LC. NOTE: If statement is .BLKW or .STRINGZ, increment LC by the number of words allocated. Stop when .END statement is reached. NOTE: A line that contains only a comment is considered an empty line.

Practice Symbol Address Construct the symbol table for the program in Figure 7.1 (Slides 7-11 through 7-13). Symbol Address

Second Pass: Generating Machine Language For each executable assembly language statement, generate the corresponding machine language instruction. If operand is a label, look up the address from the symbol table. Potential problems: Improper number or type of arguments ex: NOT R1,#7 ADD R1,R2 ADD R3,R3,NUMBER Immediate argument too large ex: ADD R1,R2,#1023 Address (associated with label) more than 256 from instruction can’t use PC-relative addressing mode

Practice Statement Machine Language Using the symbol table constructed earlier, translate these statements into LC-3 machine language. Statement Machine Language LD R3,PTR ADD R4,R1,#-4 LDR R1,R3,#0 BRnp GETCHAR

LC-3 Assembler Using “assemble” (Unix) or LC3Edit (Windows), generates several different output files. This one gets loaded into the simulator.

Object File Format LC-3 object file contains Example Starting address (location where program must be loaded), followed by… Machine instructions Example Beginning of “count character” object file looks like this: 0011000000000000 0101010010100000 0010011000010001 1111000000100011 . . . .ORIG x3000 AND R2, R2, #0 LD R3, PTR TRAP x23

Multiple Object Files An object file is not necessarily a complete program. system-provided library routines code blocks written by multiple developers For LC-3 simulator, can load multiple object files into memory, then start executing at a desired address. system routines, such as keyboard input, are loaded automatically loaded into “system memory,” below x3000 user code should be loaded between x3000 and xFDFF each object file includes a starting address be careful not to load overlapping object files

Linking and Loading Loading is the process of copying an executable image into memory. more sophisticated loaders are able to relocate images to fit into available memory must readjust branch targets, load/store addresses Linking is the process of resolving symbols between independent object files. suppose we define a symbol in one module, and want to use it in another some notation, such as .EXTERNAL, is used to tell assembler that a symbol is defined in another module linker will search symbol tables of other modules to resolve symbols and complete code generation before loading

Chapter 8 I/O

I/O: Connecting to Outside World So far, we’ve learned how to: compute with values in registers load data from memory to registers store data from registers to memory But where does data in memory come from? And how does data get out of the system so that humans can use it?

I/O: Connecting to the Outside World Types of I/O devices characterized by: behavior: input, output, storage input: keyboard, motion detector, network interface output: monitor, printer, network interface storage: disk, CD-ROM data rate: how fast can data be transferred? keyboard: 100 bytes/sec disk: 30 MB/s network: 1 Mb/s - 1 Gb/s

I/O Controller CPU Control/Status Registers Data Registers CPU tells device what to do -- write to control register CPU checks whether task is done -- read status register Data Registers CPU transfers data to/from device Device electronics performs actual operation pixels to screen, bits to/from disk, characters from keyboard Graphics Controller Control/Status CPU Electronics display Output Data

Programming Interface How are device registers identified? Memory-mapped vs. special instructions How is timing of transfer managed? Asynchronous vs. synchronous Who controls transfer? CPU (polling) vs. device (interrupts)

Memory-Mapped vs. I/O Instructions designate opcode(s) for I/O register and operation encoded in instruction Memory-mapped assign a memory address to each device register use data movement instructions (LD/ST) for control and data transfer

Transfer Timing I/O events generally happen much slower than CPU cycles. Synchronous data supplied at a fixed, predictable rate CPU reads/writes every X cycles Asynchronous data rate less predictable CPU must synchronize with device, so that it doesn’t miss data or write too quickly

Transfer Control Who determines when the next data transfer occurs? Polling CPU keeps checking status register until new data arrives OR device ready for next data “Are we there yet? Are we there yet? Are we there yet?” Interrupts Device sends a special signal to CPU when new data arrives OR device ready for next data CPU can be performing other tasks instead of polling device. “Wake me when we get there.”

LC-3 Memory-mapped I/O (Table A.3) xFE00 xFE02 xFE04 Asynchronous devices synchronized through status registers Polling and Interrupts the details of interrupts will be discussed in Chapter 10 Location I/O Register Function xFE00 Keyboard Status Reg (KBSR) Bit [15] is one when keyboard has received a new character. xFE02 Keyboard Data Reg (KBDR) Bits [7:0] contain the last character typed on keyboard. xFE04 Display Status Register (DSR) Bit [15] is one when device ready to display another char on screen. xFE06 Display Data Register (DDR) Character written to bits [7:0] will be displayed on screen.

Input from Keyboard When a character is typed: When KBDR is read: its ASCII code is placed in bits [7:0] of KBDR (bits [15:8] are always zero) the “ready bit” (KBSR[15]) is set to one keyboard is disabled -- any typed characters will be ignored When KBDR is read: KBSR[15] is set to zero keyboard is enabled keyboard data 15 8 7 KBDR 15 14 ready bit KBSR

Basic Input Routine POLL LDI R0, KBSRPtr BRzp POLL LDI R0, KBDRPtr ... KBSRPtr .FILL xFE00 KBDRPtr .FILL xFE02 new char? NO Polling YES read character

Simple Implementation: Memory-Mapped Input Address Control Logic determines whether MDR is loaded from Memory or from KBSR/KBDR.

Output to Monitor When Monitor is ready to display another character: the “ready bit” (DSR[15]) is set to one When data is written to Display Data Register: DSR[15] is set to zero character in DDR[7:0] is displayed any other character data written to DDR is ignored (while DSR[15] is zero) output data 15 8 7 DDR 15 14 ready bit DSR

Basic Output Routine POLL LDI R1, DSRPtr BRzp POLL STI R0, DDRPtr ... DSRPtr .FILL xFE04 DDRPtr .FILL xFE06 screen ready? NO Polling YES write character

Simple Implementation: Memory-Mapped Output Sets LD.DDR or selects DSR as input.

Keyboard Echo Routine Usually, input character is also printed to screen. User gets feedback on character typed and knows its ok to type the next character. new char? POLL1 LDI R0, KBSRPtr BRzp POLL1 LDI R0, KBDRPtr POLL2 LDI R1, DSRPtr BRzp POLL2 STI R0, DDRPtr ... KBSRPtr .FILL xFE00 KBDRPtr .FILL xFE02 DSRPtr .FILL xFE04 DDRPtr .FILL xFE06 NO YES read character screen ready? NO YES write character

Interrupt-Driven I/O External device can: Force currently executing program to stop; Have the processor satisfy the device’s needs; and Resume the stopped program as if nothing happened. Why? Polling consumes a lot of cycles, especially for rare events – these cycles can be used for more computation. Example: Process previous input while collecting current input. (See Example 8.1 in text.)

Interrupt-Driven I/O To implement an interrupt mechanism, we need: A way for the I/O device to signal the CPU that an interesting event has occurred. A way for the CPU to test whether the interrupt signal is set and whether its priority is higher than the current program. Generating Signal Software sets "interrupt enable" bit in device register. When ready bit is set and IE bit is set, interrupt is signaled. interrupt enable bit 15 14 13 ready bit KBSR interrupt signal to processor

Priority Every instruction executes at a stated level of urgency. LC-3: 8 priority levels (PL0-PL7) Example: Payroll program runs at PL0. Nuclear power correction program runs at PL6. It’s OK for PL6 device to interrupt PL0 program, but not the other way around. Priority encoder selects highest-priority device, compares to current processor priority level, and generates interrupt signal if appropriate.

Testing for Interrupt Signal CPU looks at signal between STORE and FETCH phases. If not set, continues with next instruction. If set, transfers control to interrupt service routine. F NO D interrupt signal? Transfer to ISR YES EA OP EX More details in Chapter 10. S

Full Implementation of LC-3 Memory-Mapped I/O Because of interrupt enable bits, status registers (KBSR/DSR) must be written, as well as read.

Review Questions What is the danger of not testing the DSR before writing data to the screen? What is the danger of not testing the KBSR before reading data from the keyboard? What if the Monitor were a synchronous device, e.g., we know that it will be ready 1 microsecond after character is written. Can we avoid polling? How? What are advantages and disadvantages?

Review Questions Do you think polling is a good approach for other devices, such as a disk or a network interface? What is the advantage of using LDI/STI for accessing device registers?

Chapter 9 TRAP Routines and Subroutines

System Calls Certain operations require specialized knowledge and protection: specific knowledge of I/O device registers and the sequence of operations needed to use them I/O resources shared among multiple users/programs; a mistake could affect lots of other users! Not every programmer knows (or wants to know) this level of detail Provide service routines or system calls (part of operating system) to safely and conveniently perform low-level, privileged operations

In LC-3, this is done through the TRAP mechanism. System Call 1. User program invokes system call. 2. Operating system code performs operation. 3. Returns control to user program. In LC-3, this is done through the TRAP mechanism.

LC-3 TRAP Mechanism 1. A set of service routines. part of operating system -- routines start at arbitrary addresses (convention is that system code is below x3000) up to 256 routines 2. Table of starting addresses. stored at x0000 through x00FF in memory called System Control Block in some architectures 3. TRAP instruction. used by program to transfer control to operating system 8-bit trap vector names one of the 256 service routines 4. A linkage back to the user program. want execution to resume immediately after the TRAP instruction

TRAP Instruction Trap vector Where to go How to get back identifies which system call to invoke 8-bit index into table of service routine addresses in LC-3, this table is stored in memory at 0x0000 – 0x00FF 8-bit trap vector is zero-extended into 16-bit memory address Where to go lookup starting address from table; place in PC How to get back save address of next instruction (current PC) in R7

TRAP NOTE: PC has already been incremented during instruction fetch stage.

RET (JMP R7) How do we transfer control back to instruction following the TRAP? We saved old PC in R7. JMP R7 gets us back to the user program at the right spot. LC-3 assembly language lets us use RET (return) in place of “JMP R7”. Must make sure that service routine does not change R7, or we won’t know where to return.

TRAP Mechanism Operation Lookup starting address. Transfer to service routine. Return (JMP R7).

Example: Using the TRAP Instruction .ORIG x3000 LD R2, TERM ; Load negative ASCII ‘7’ LD R3, ASCII ; Load ASCII difference AGAIN TRAP x23 ; input character ADD R1, R2, R0 ; Test for terminate BRz EXIT ; Exit if done ADD R0, R0, R3 ; Change to lowercase TRAP x21 ; Output to monitor... BRnzp AGAIN ; ... again and again... TERM .FILL xFFC9 ; -‘7’ ASCII .FILL x0020 ; lowercase bit EXIT TRAP x25 ; halt .END

Example: Output Service Routine .ORIG x0430 ; syscall address ST R7, SaveR7 ; save R7 & R1 ST R1, SaveR1 ; ----- Write character TryWrite LDI R1, CRTSR ; get status BRzp TryWrite ; look for bit 15 on WriteIt STI R0, CRTDR ; write char ; ----- Return from TRAP Return LD R1, SaveR1 ; restore R1 & R7 LD R7, SaveR7 RET ; back to user CRTSR .FILL xF3FC CRTDR .FILL xF3FF SaveR1 .FILL 0 SaveR7 .FILL 0 .END stored in table, location x21

TRAP Routines and their Assembler Names vector symbol routine x20 GETC read a single character (no echo) x21 OUT output a character to the monitor x22 PUTS write a string to the console x23 IN print prompt to console, read and echo character from keyboard x25 HALT halt the program

Saving and Restoring Registers Must save the value of a register if: Its value will be destroyed by service routine, and We will need to use the value after that action. Who saves? caller of service routine? knows what it needs later, but may not know what gets altered by called routine called service routine? knows what it alters, but does not know what will be needed later by calling routine

What’s wrong with this routine? What happens to R7? Example LEA R3, Binary LD R6, ASCII ; char->digit template LD R7, COUNT ; initialize to 10 AGAIN TRAP x23 ; Get char ADD R0, R0, R6 ; convert to number STR R0, R3, #0 ; store number ADD R3, R3, #1 ; incr pointer ADD R7, R7, -1 ; decr counter BRp AGAIN ; more? BRnzp NEXT ASCII .FILL xFFD0 COUNT .FILL #10 Binary .BLKW #10 What’s wrong with this routine? What happens to R7?

Saving and Restoring Registers Called routine -- “callee-save” Before start, save any registers that will be altered (unless altered value is desired by calling program!) Before return, restore those same registers Calling routine -- “caller-save” Save registers destroyed by own instructions or by called routines (if known), if values needed later save R7 before TRAP save R0 before TRAP x23 (input character) Or avoid using those registers altogether Values are saved by storing them in memory.

Question Can a service routine call another service routine? If so, is there anything special the calling service routine must do?

What about User Code? Service routines provide three main functions: 1. Shield programmers from system-specific details. 2. Write frequently-used code just once. 3. Protect system resources from malicious/clumsy programmers. Are there any reasons to provide the same functions for non-system (user) code?

Subroutines A subroutine is a program fragment that: lives in user space performs a well-defined task is invoked (called) by another user program returns control to the calling program when finished Like a service routine, but not part of the OS not concerned with protecting hardware resources no special privilege required Reasons for subroutines: reuse useful (and debugged!) code without having to keep typing it in divide task among multiple programmers use vendor-supplied library of useful routines

JSR Instruction Jumps to a location (like a branch but unconditional), and saves current PC (addr of next instruction) in R7. saving the return address is called “linking” target address is PC-relative (PC + Sext(IR[10:0])) bit 11 specifies addressing mode if =1, PC-relative: target address = PC + Sext(IR[10:0]) if =0, register: target address = contents of register IR[8:6]

JSR NOTE: PC has already been incremented during instruction fetch stage.

JSRR Instruction Just like JSR, except Register addressing mode. target address is Base Register bit 11 specifies addressing mode What important feature does JSRR provide that JSR does not?

JSRR NOTE: PC has already been incremented during instruction fetch stage.

Returning from a Subroutine RET (JMP R7) gets us back to the calling routine. just like TRAP

Example: Negate the value in R0 2sComp NOT R0, R0 ; flip bits ADD R0, R0, #1 ; add one RET ; return to caller To call from a program (within 1024 instructions): ; need to compute R4 = R1 - R3 ADD R0, R3, #0 ; copy R3 to R0 JSR 2sComp ; negate ADD R4, R1, R0 ; add to R1 ... Note: Caller should save R0 if we’ll need it later!

Passing Information to/from Subroutines Arguments A value passed in to a subroutine is called an argument. This is a value needed by the subroutine to do its job. Examples: In 2sComp routine, R0 is the number to be negated In OUT service routine, R0 is the character to be printed. In PUTS routine, R0 is address of string to be printed. Return Values A value passed out of a subroutine is called a return value. This is the value that you called the subroutine to compute. In 2sComp routine, negated value is returned in R0. In GETC service routine, character read from the keyboard is returned in R0.

Using Subroutines In order to use a subroutine, a programmer must know: its address (or at least a label that will be bound to its address) its function (what does it do?) NOTE: The programmer does not need to know how the subroutine works, but what changes are visible in the machine’s state after the routine has run. its arguments (where to pass data in, if any) its return values (where to get computed data, if any)

Saving and Restore Registers Since subroutines are just like service routines, we also need to save and restore registers, if needed. Generally use “callee-save” strategy, except for return values. Save anything that the subroutine will alter internally that shouldn’t be visible when the subroutine returns. It’s good practice to restore incoming arguments to their original values (unless overwritten by return value). Remember: You MUST save R7 if you call any other subroutine or service routine (TRAP). Otherwise, you won’t be able to return to caller.

Example Write a subroutine FirstChar to: find the first occurrence of a particular character (in R0) in a string (pointed to by R1); return pointer to character or to end of string (NULL) in R2. (2) Use FirstChar to write CountChar, which: counts the number of occurrences of a particular character (in R0) in a string (pointed to by R1); return count in R2. Can write the second subroutine first, without knowing the implementation of FirstChar!

CountChar Algorithm (using FirstChar) save regs call FirstChar R3 <- M(R2) R3=0 R1 <- R2 + 1 restore regs return no yes save R7, since we’re using JSR

CountChar Implementation ; CountChar: subroutine to count occurrences of a char CountChar ST R3, CCR3 ; save registers ST R4, CCR4 ST R7, CCR7 ; JSR alters R7 ST R1, CCR1 ; save original string ptr AND R4, R4, #0 ; initialize count to zero CC1 JSR FirstChar ; find next occurrence (ptr in R2) LDR R3, R2, #0 ; see if char or null BRz CC2 ; if null, no more chars ADD R4, R4, #1 ; increment count ADD R1, R2, #1 ; point to next char in string BRnzp CC1 CC2 ADD R2, R4, #0 ; move return val (count) to R2 LD R3, CCR3 ; restore regs LD R4, CCR4 LD R1, CCR1 LD R7, CCR7 RET ; and return

FirstChar Algorithm R3=R0 save regs R2 <- R1 R2 <- R2 + 1 R3 <- M(R2) R3=0 R3=R0 R2 <- R2 + 1 restore regs return no yes

FirstChar Implementation ; FirstChar: subroutine to find first occurrence of a char FirstChar ST R3, FCR3 ; save registers ST R4, FCR4 ; save original char NOT R4, R0 ; negate R0 for comparisons ADD R4, R4, #1 ADD R2, R1, #0 ; initialize ptr to beginning of string FC1 LDR R3, R2, #0 ; read character BRz FC2 ; if null, we’re done ADD R3, R3, R4 ; see if matches input char BRz FC2 ; if yes, we’re done ADD R2, R2, #1 ; increment pointer BRnzp FC1 FC2 LD R3, FCR3 ; restore registers LD R4, FCR4 ; RET ; and return

Library Routines Vendor may provide object files containing useful subroutines don’t want to provide source code -- intellectual property assembler/linker must support EXTERNAL symbols (or starting address of routine must be supplied to user) ... .EXTERNAL SQRT ... LD R2, SQAddr ; load SQRT addr JSRR R2 ... SQAddr .FILL SQRT Using JSRR, because we don’t know whether SQRT is within 1024 instructions.

Chapter 10 And, Finally... The Stack

Stack: An Abstract Data Type An important abstraction that you will encounter in many applications. We will describe three uses: Interrupt-Driven I/O The rest of the story… Evaluating arithmetic expressions Store intermediate results on stack instead of in registers Data type conversion 2’s comp binary to ASCII strings

Stacks A LIFO (last-in first-out) storage structure. The first thing you put in is the last thing you take out. The last thing you put in is the first thing you take out. This means of access is what defines a stack, not the specific implementation. Two main operations: PUSH: add an item to the stack POP: remove an item from the stack

A Physical Stack Coin rest in the arm of an automobile First quarter out is the last quarter in. 1995 1996 1998 1998 1982 1982 1995 1995 Initial State After One Push After Three More Pushes After One Pop

A Hardware Implementation Data items move between registers Empty: Yes Empty: No Empty: No Empty: No / / / / / / TOP #18 TOP #12 TOP #31 TOP / / / / / / / / / / / / #5 #18 / / / / / / / / / / / / #31 / / / / / / / / / / / / / / / / / / #18 / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / Initial State After One Push After Three More Pushes After Two Pops

A Software Implementation Data items don't move in memory, just our idea about there the TOP of the stack is. / / / / / / / / / / / / #12 TOP #12 / / / / / / / / / / / / #5 #5 / / / / / / / / / / / / #31 #31 TOP / / / / / / #18 TOP #18 #18 / / / / / / TOP / / / / / / / / / / / / / / / / / / x4000 R6 x3FFF R6 x3FFC R6 x3FFE R6 Initial State After One Push After Three More Pushes After Two Pops By convention, R6 holds the Top of Stack (TOS) pointer.

Basic Push and Pop Code For our implementation, stack grows downward (when item added, TOS moves closer to 0) Push ADD R6, R6, #-1 ; decrement stack ptr STR R0, R6, #0 ; store data (R0) Pop LDR R0, R6, #0 ; load data from TOS ADD R6, R6, #1 ; decrement stack ptr

Pop with Underflow Detection If we try to pop too many items off the stack, an underflow condition occurs. Check for underflow by checking TOS before removing data. Return status code in R5 (0 for success, 1 for underflow) POP LD R1, EMPTY ; EMPTY = -x4000 ADD R2, R6, R1 ; Compare stack pointer BRz FAIL ; with x3FFF LDR R0, R6, #0 ADD R6, R6, #1 AND R5, R5, #0 ; SUCCESS: R5 = 0 RET FAIL AND R5, R5, #0 ; FAIL: R5 = 1 ADD R5, R5, #1 RET EMPTY .FILL xC000

Push with Overflow Detection If we try to push too many items onto the stack, an overflow condition occurs. Check for underflow by checking TOS before adding data. Return status code in R5 (0 for success, 1 for overflow) PUSH LD R1, MAX ; MAX = -x3FFB ADD R2, R6, R1 ; Compare stack pointer BRz FAIL ; with x3FFF ADD R6, R6, #-1 STR R0, R6, #0 AND R5, R5, #0 ; SUCCESS: R5 = 0 RET FAIL AND R5, R5, #0 ; FAIL: R5 = 1 ADD R5, R5, #1 RET MAX .FILL xC005

Interrupt-Driven I/O (Part 2) Interrupts were introduced in Chapter 8. External device signals need to be serviced. Processor saves state and starts service routine. When finished, processor restores state and resumes program. Chapter 8 didn’t explain how (2) and (3) occur, because it involves a stack. Now, we’re ready… Interrupt is an unscripted subroutine call, triggered by an external event.

Processor State What state is needed to completely capture the state of a running process? Processor Status Register Privilege [15], Priority Level [10:8], Condition Codes [2:0] Program Counter Pointer to next instruction to be executed. Registers All temporary state of the process that’s not stored in memory.

Where to Save Processor State? Can’t use registers. Programmer doesn’t know when interrupt might occur, so she can’t prepare by saving critical registers. When resuming, need to restore state exactly as it was. Memory allocated by service routine? Must save state before invoking routine, so we wouldn’t know where. Also, interrupts may be nested – that is, an interrupt service routine might also get interrupted! Use a stack! Location of stack “hard-wired”. Push state to save, pop to restore.

Supervisor Stack A special region of memory used as the stack for interrupt service routines. Initial Supervisor Stack Pointer (SSP) stored in Saved.SSP. Another register for storing User Stack Pointer (USP): Saved.USP. Want to use R6 as stack pointer. So that our PUSH/POP routines still work. When switching from User mode to Supervisor mode (as result of interrupt), save R6 to Saved.USP.

Invoking the Service Routine – The Details If Priv = 1 (user), Saved.USP = R6, then R6 = Saved.SSP. Push PSR and PC to Supervisor Stack. Set PSR[15] = 0 (supervisor mode). Set PSR[10:8] = priority of interrupt being serviced. Set PSR[2:0] = 0. Set MAR = x01vv, where vv = 8-bit interrupt vector provided by interrupting device (e.g., keyboard = x80). Load memory location (M[x01vv]) into MDR. Set PC = MDR; now first instruction of ISR will be fetched. Note: This all happens between the STORE RESULT of the last user instruction and the FETCH of the first ISR instruction.

Returning from Interrupt Special instruction – RTI – that restores state. Pop PC from supervisor stack. (PC = M[R6]; R6 = R6 + 1) Pop PSR from supervisor stack. (PSR = M[R6]; R6 = R6 + 1) If PSR[15] = 1, R6 = Saved.USP. (If going back to user mode, need to restore User Stack Pointer.) RTI is a privileged instruction. Can only be executed in Supervisor Mode. If executed in User Mode, causes an exception. (More about that later.)

Executing ADD at location x3006 when Device B interrupts. Example (1) Program A Saved.SSP / / / / / / / / / / / / x3006 ADD / / / / / / / / / / / / / / / / / / PC x3006 Executing ADD at location x3006 when Device B interrupts.

Saved.USP = R6. R6 = Saved.SSP. Example (2) Program A ISR for Device B x6200 / / / / / / / / / / / / x3006 ADD R6 x3007 PSR for A x6210 RTI / / / / / / PC x6200 Saved.USP = R6. R6 = Saved.SSP. Push PSR and PC onto stack, then transfer to Device B service routine (at x6200).

Executing AND at x6202 when Device C interrupts. Example (3) Program A ISR for Device B x6200 / / / / / / x6202 AND / / / / / / x3006 ADD R6 x3007 PSR for A x6210 RTI / / / / / / PC x6203 Executing AND at x6202 when Device C interrupts.

Example (4) R6 x6203 AND ADD x3007 RTI / / / / / / PC x6300 RTI Program A ISR for Device B x6200 R6 x6203 x6202 AND PSR for B x3006 ADD ISR for Device C x3007 PSR for A x6210 RTI x6300 / / / / / / PC x6300 x6315 RTI Push PSR and PC onto stack, then transfer to Device C service routine (at x6300).

Execute RTI at x6315; pop PC and PSR from stack. Example (5) Program A ISR for Device B x6200 x6203 x6202 AND PSR for B x3006 ADD ISR for Device C R6 x3007 PSR for A x6210 RTI x6300 / / / / / / PC x6203 x6315 RTI Execute RTI at x6315; pop PC and PSR from stack.

Example (6) Saved.SSP x6203 AND ADD x3007 RTI / / / / / / PC x3007 RTI Program A ISR for Device B Saved.SSP x6200 x6203 x6202 AND PSR for B x3006 ADD ISR for Device C x3007 PSR for A x6210 RTI x6300 / / / / / / PC x3007 x6315 RTI Execute RTI at x6210; pop PSR and PC from stack. Restore R6. Continue Program A as if nothing happened.

Exception: Internal Interrupt When something unexpected happens inside the processor, it may cause an exception. Examples: Privileged operation (e.g., RTI in user mode) Executing an illegal opcode Divide by zero Accessing an illegal address (e.g., protected system memory) Handled just like an interrupt Vector is determined internally by type of exception Priority is the same as running program

Arithmetic Using a Stack Instead of registers, some ISA's use a stack for source and destination operations: a zero-address machine. Example: ADD instruction pops two numbers from the stack, adds them, and pushes the result to the stack. Evaluating (A+B)·(C+D) using a stack: (1) push A (2) push B (3) ADD (4) push C (5) push D (6) ADD (7) MULTIPLY (8) pop result Why use a stack? Limited registers. Convenient calling convention for subroutines. Algorithm naturally expressed using FIFO data structure.

Example: OpAdd POP two values, ADD, then PUSH result.

Example: OpAdd OpAdd JSR POP ; Get first operand. ADD R5,R5,#0 ; Check for POP success. BRp Exit ; If error, bail. ADD R1,R0,#0 ; Make room for second. JSR POP ; Get second operand. ADD R5,R5,#0 ; Check for POP success. BRp Restore1 ; If err, restore & bail. ADD R0,R0,R1 ; Compute sum. JSR RangeCheck ; Check size. BRp Restore2 ; If err, restore & bail. JSR PUSH ; Push sum onto stack. RET Restore2 ADD R6,R6,#-1 ; Decr stack ptr (undo POP) Restore1 ADD R6,R6,#-1 ; Decr stack ptr Exit RET

Data Type Conversion Keyboard input routines read ASCII characters, not binary values. Similarly, output routines write ASCII. Consider this program: TRAP x23 ; input from keybd ADD R1, R0, #0 ; move to R1 TRAP x23 ; input from keybd ADD R0, R1, R0 ; add two inputs TRAP x21 ; display result TRAP x25 ; HALT User inputs 2 and 3 -- what happens? Result displayed: e Why? ASCII '2' (x32) + ASCII '3' (x33) = ASCII 'e' (x65)

ASCII to Binary Useful to deal with mult-digit decimal numbers Assume we've read three ASCII digits (e.g., "259") into a memory buffer. How do we convert this to a number we can use? Convert first character to digit (subtract x30) and multiply by 100. Convert second character to digit and multiply by 10. Convert third character to digit. Add the three digits together. x32 '2' x35 '5' x39 '9'

Multiplication via a Lookup Table How can we multiply a number by 100? One approach: Add number to itself 100 times. Another approach: Add 100 to itself <number> times. (Better if number < 100.) Since we have a small range of numbers (0-9), use number as an index into a lookup table. Entry 0: 0 x 100 = 0 Entry 1: 1 x 100 = 100 Entry 2: 2 x 100 = 200 Entry 3: 3 x 100 = 300 etc.

Code for Lookup Table ; multiply R0 by 100, using lookup table ; LEA R1, Lookup100 ; R1 = table base ADD R1, R1, R0 ; add index (R0) LDR R0, R1, #0 ; load from M[R1] ... Lookup100 .FILL 0 ; entry 0 .FILL 100 ; entry 1 .FILL 200 ; entry 2 .FILL 300 ; entry 3 .FILL 400 ; entry 4 .FILL 500 ; entry 5 .FILL 600 ; entry 6 .FILL 700 ; entry 7 .FILL 800 ; entry 8 .FILL 900 ; entry 9

Complete Conversion Routine (1 of 3) ; Three-digit buffer at ASCIIBUF. ; R1 tells how many digits to convert. ; Put resulting decimal number in R0. ASCIItoBinary AND R0, R0, #0 ; clear result ADD R1, R1, #0 ; test # digits BRz DoneAtoB ; done if no digits ; LD R3, NegZero ; R3 = -x30 LEA R2, ASCIIBUF ADD R2, R2, R1 ADD R2, R2, #-1 ; points to ones digit ; LDR R4, R2, #0 ; load digit ADD R4, R4, R3 ; convert to number ADD R0, R0, R4 ; add ones contrib

Conversion Routine (2 of 3) ADD R1, R1, #-1 ; one less digit BRz DoneAtoB ; done if zero ADD R2, R2, #-1 ; points to tens digit ; LDR R4, R2, #0 ; load digit ADD R4, R4, R3 ; convert to number LEA R5, Lookup10 ; multiply by 10 ADD R5, R5, R4 LDR R4, R5, #0 ADD R0, R0, R4 ; adds tens contrib ; ADD R1, R1, #-1 ; one less digit BRz DoneAtoB ; done if zero ADD R2, R2, #-1 ; points to hundreds ; digit

Conversion Routine (3 of 3) LDR R4, R2, #0 ; load digit ADD R4, R4, R3 ; convert to number LEA R5, Lookup100 ; multiply by 100 ADD R5, R5, R4 LDR R4, R5, #0 ADD R0, R0, R4 ; adds 100's contrib ; DoneAtoB RET NegZero .FILL xFFD0 ; -x30 ASCIIBUF .BLKW 4 Lookup10 .FILL 0 .FILL 10 .FILL 20 ... Lookup100 .FILL 0 .FILL 100 ...

Binary to ASCII Conversion Converting a 2's complement binary value to a three-digit decimal number Resulting characters can be output using OUT Instead of multiplying, we need to divide by 100 to get hundreds digit. Why wouldn't we use a lookup table for this problem? Subtract 100 repeatedly from number to divide. First, check whether number is negative. Write sign character (+ or -) to buffer and make positive.

Binary to ASCII Conversion Code (part 1 of 3) ; R0 is between -999 and +999. ; Put sign character in ASCIIBUF, followed by three ; ASCII digit characters. BinaryToASCII LEA R1, ASCIIBUF ; pt to result string ADD R0, R0, #0 ; test sign of value BRn NegSign LD R2, ASCIIplus ; store '+' STR R2, R1, #0 BRnzp Begin100 NegSign LD R2, ASCIIneg ; store '-' STR R2, R1, #0 NOT R0, R0 ; convert value to pos ADD R0, R0, #1

Conversion (2 of 3) Begin100 LD R2, ASCIIoffset LD R3, Neg100 Loop100 ADD R0, R0, R3 BRn End100 ADD R2, R2, #1 ; add one to digit BRnzp Loop100 End100 STR R2, R1, #1 ; store ASCII 100's digit LD R3, Pos100 ADD R0, R0, R3 ; restore last subtract ; LD R2, ASCIIoffset LD R3, Neg10 Loop100 ADD R0, R0, R3 BRn End10 ADD R2, R2, #1 ; add one to digit BRnzp Loop10

Conversion Code (3 of 3) End10 STR R2, R1, #2 ; store ASCII 10's digit ADD R0, R0, #10 ; restore last subtract ; LD R2, ASCIIoffset ADD R2, R2, R0 ; convert one's digit STR R2, R1, #3 ; store one's digit RET ; ASCIIplus .FILL x2B ; plus sign ASCIIneg .FILL x2D ; neg sign ASCIIoffset .FILL x30 ; zero Neg100 .FILL xFF9C ; -100 Pos100 .FILL #100 Neg10 .FILL xFFF6 ; -10

Chapter 11 Introduction to Programming in C

C: A High-Level Language Gives symbolic names to values don’t need to know which register or memory location Provides abstraction of underlying hardware operations do not depend on instruction set example: can write “a = b * c”, even though LC-3 doesn’t have a multiply instruction Provides expressiveness use meaningful symbols that convey meaning simple expressions for common control patterns (if-then-else) Enhances code readability Safeguards against bugs can enforce rules or conditions at compile-time or run-time

Compilation vs. Interpretation Different ways of translating high-level language Interpretation interpreter = program that executes program statements generally one line/command at a time limited processing easy to debug, make changes, view intermediate results languages: BASIC, LISP, Perl, Java, Matlab, C-shell Compilation translates statements into machine language does not execute, but creates executable program performs optimization over multiple statements change requires recompilation can be harder to debug, since executed code may be different languages: C, C++, Fortran, Pascal

Compilation vs. Interpretation Consider the following algorithm: Get W from the keyboard. X = W + W Y = X + X Z = Y + Y Print Z to screen. If interpreting, how many arithmetic operations occur? If compiling, we can analyze the entire program and possibly reduce the number of operations. Can we simplify the above algorithm to use a single arithmetic operation?

Compiling a C Program Entire mechanism is usually called the “compiler” Preprocessor macro substitution conditional compilation “source-level” transformations output is still C Compiler generates object file machine instructions Linker combine object files (including libraries) into executable image

Compiler Source Code Analysis Code Generation Symbol Table “front end” parses programs to identify its pieces variables, expressions, statements, functions, etc. depends on language (not on target machine) Code Generation “back end” generates machine code from analyzed source may optimize machine code to make it run more efficiently very dependent on target machine Symbol Table map between symbolic names and items like assembler, but more kinds of information

A Simple C Program #include <stdio.h> #define STOP 0 /* Function: main */ /* Description: counts down from user input to STOP */ main() { /* variable declarations */ int counter; /* an integer to hold count values */ int startPoint; /* starting point for countdown */ /* prompt user for input */ printf("Enter a positive number: "); scanf("%d", &startPoint); /* read into startPoint */ /* count down and print count */ for (counter=startPoint; counter >= STOP; counter--) printf("%d\n", counter); }

Preprocessor Directives #include <stdio.h> Before compiling, copy contents of header file (stdio.h) into source code. Header files typically contain descriptions of functions and variables needed by the program. no restrictions -- could be any C source code #define STOP 0 Before compiling, replace all instances of the string "STOP" with the string "0" Called a macro Used for values that won't change during execution, but might change if the program is reused. (Must recompile.)

Comments Begins with /* and ends with */ Can span multiple lines Cannot have a comment within a comment Comments are not recognized within a string example: "my/*don't print this*/string" would be printed as: my/*don't print this*/string As before, use comments to help reader, not to confuse or to restate the obvious

main Function Every C program must have a function called main(). This is the code that is executed when the program is run. The code for the function lives within brackets: main() { /* code goes here */ }

Variable Declarations Variables are used as names for data items. Each variable has a type, which tells the compiler how the data is to be interpreted (and how much space it needs, etc.). int counter; int startPoint; int is a predefined integer type in C.

Input and Output Variety of I/O functions in C Standard Library. Must include <stdio.h> to use them. printf("%d\n", counter); String contains characters to print and formatting directions for variables. This call says to print the variable counter as a decimal integer, followed by a linefeed (\n). scanf("%d", &startPoint); String contains formatting directions for looking at input. This call says to read a decimal integer and assign it to the variable startPoint. (Don't worry about the & yet.)

More About Output Can print arbitrary expressions, not just variables printf("%d\n", startPoint - counter); Print multiple expressions with a single statement printf("%d %d\n", counter, startPoint - counter); Different formatting options: %d decimal integer %x hexadecimal integer %c ASCII character %f floating-point number

Examples This code: produces this output: 43 is a prime number. printf("%d is a prime number.\n", 43); printf("43 plus 59 in decimal is %d.\n", 43+59); printf("43 plus 59 in hex is %x.\n", 43+59); printf("43 plus 59 as a character is %c.\n", 43+59); produces this output: 43 is a prime number. 43 + 59 in decimal is 102. 43 + 59 in hex is 66. 43 + 59 as a character is f.

Examples of Input Many of the same formatting characters are available for user input. scanf("%c", &nextChar); reads a single character and stores it in nextChar scanf("%f", &radius); reads a floating point number and stores it in radius scanf("%d %d", &length, &width); reads two decimal integers (separated by whitespace), stores the first one in length and the second in width Must use ampersand (&) for variables being modified. (Explained in Chapter 16.)

Compiling and Linking Various compilers available cc, gcc includes preprocessor, compiler, and linker Lots and lots of options! level of optimization, debugging preprocessor, linker options intermediate files -- object (.o), assembler (.s), preprocessor (.i), etc.

Remaining Chapters A more detailed look at many C features. Variables and declarations Operators Control Structures Functions Data Structures I/O Emphasis on how C is converted to LC-3 assembly language. Also see C Reference in Appendix D.

Chapter 12 Variables and Operators

Basic C Elements Variables Operators named, typed data items Operators predefined actions performed on data items combined with variables to form expressions, statements Rules and usage Implementation using LC-3

Data Types C has three basic data types int integer (at least 16 bits) double floating point (at least 32 bits) char character (at least 8 bits) Exact size can vary, depending on processor int is supposed to be "natural" integer size; for LC-3, that's 16 bits -- 32 bits for most modern processors

Variable Names Any combination of letters, numbers, and underscore (_) Case matters "sum" is different than "Sum" Cannot begin with a number usually, variables beginning with underscore are used only in special library routines Only first 31 characters are used

Examples Legal i wordsPerSecond words_per_second _green aReally_longName_moreThan31chars aReally_longName_moreThan31characters Illegal 10sdigit ten'sdigit done? double same identifier reserved keyword

Literals Integer 123 /* decimal */ -123 0x123 /* hexadecimal */ Floating point 6.023 6.023e23 /* 6.023 x 1023 */ 5E12 /* 5.0 x 1012 */ Character 'c' '\n' /* newline */ '\xA' /* ASCII 10 (0xA) */

Scope: Global and Local Where is the variable accessible? Global: accessed anywhere in program Local: only accessible in a particular region Compiler infers scope from where variable is declared programmer doesn't have to explicitly state Variable is local to the block in which it is declared block defined by open and closed braces { } can access variable declared in any "containing" block Global variable is declared outside all blocks

Example Output Global 0 Local 1 Global 4 Local 2 Global 4 Local 1 #include <stdio.h> int itsGlobal = 0; main() { int itsLocal = 1; /* local to main */ printf("Global %d Local %d\n", itsGlobal, itsLocal); int itsLocal = 2; /* local to this block */ itsGlobal = 4; /* change global variable */ } Output Global 0 Local 1 Global 4 Local 2 Global 4 Local 1

Operators Programmers manipulate variables using the operators provided by the high-level language. Variables and operators combine to form expressions and statements which denote the work to be done by the program. Each operator may correspond to many machine instructions. Example: The multiply operator (*) typically requires multiple LC-3 ADD instructions.

Expression Any combination of variables, constants, operators, and function calls every expression has a type, derived from the types of its components (according to C typing rules) Examples: counter >= STOP x + sqrt(y) x & z + 3 || 9 - w-- % 6

Statement Expresses a complete unit of work executed in sequential order Simple statement ends with semicolon z = x * y; /* assign product to z */ y = y + 1; /* after multiplication */ ; /* null statement */ Compound statement groups simple statements using braces. syntactically equivalent to a simple statement { z = x * y; y = y + 1; }

Operators Three things to know about each operator (1) Function what does it do? (2) Precedence in which order are operators combined? Example: "a * b + c * d" is the same as "(a * b) + (c * d)" because multiply (*) has a higher precedence than addition (+) (3) Associativity in which order are operators of the same precedence combined? Example: "a - b - c" is the same as "(a - b) - c" because add/sub associate left-to-right

Assignment Operator Changes the value of a variable. x = x + 4; 1. Evaluate right-hand side. 2. Set value of left-hand side variable to result.

Assignment Operator All expressions evaluate to a value, even ones with the assignment operator. For assignment, the result is the value assigned. usually (but not always) the value of the right-hand side type conversion might make assigned value different than computed value Assignment associates right to left. y = x = 3; y gets the value 3, because (x = 3) evaluates to the value 3.

Arithmetic Operators Symbol Operation Usage Precedence Assoc * multiply x * y 6 l-to-r / divide x / y 6 l-to-r % modulo x % y 6 l-to-r + addition x + y 7 l-to-r - subtraction x - y 7 l-to-r All associate left to right. * / % have higher precedence than + -.

Arithmetic Expressions If mixed types, smaller type is "promoted" to larger. x + 4.3 if x is int, converted to double and result is double Integer division -- fraction is dropped. x / 3 if x is int and x=5, result is 1 (not 1.666666...) Modulo -- result is remainder. x % 3 if x is int and x=5, result is 2.

Bitwise Operators Symbol Operation Usage Precedence Assoc ~ bitwise NOT ~x 4 r-to-l << left shift x << y 8 l-to-r >> right shift x >> y 8 l-to-r & bitwise AND x & y 11 l-to-r ^ bitwise XOR x ^ y 12 l-to-r | bitwise OR x | y 13 l-to-r Operate on variables bit-by-bit. Like LC-3 AND and NOT instructions. Shift operations are logical (not arithmetic). Operate on values -- neither operand is changed.

Logical Operators Symbol Operation Usage Precedence Assoc ! logical NOT !x 4 r-to-l && logical AND x && y 14 l-to-r || logical OR x || y 15 l-to-r Treats entire variable (or value) as TRUE (non-zero) or FALSE (zero). Result is 1 (TRUE) or 0 (FALSE).

Relational Operators Symbol Operation Usage Precedence Assoc > greater than x > y 9 l-to-r >= greater than or equal x >= y 9 l-to-r < less than x < y 9 l-to-r <= less than or equal x <= y 9 l-to-r == equal x == y 10 l-to-r != not equal x != y 10 l-to-r Result is 1 (TRUE) or 0 (FALSE). Note: Don't confuse equality (==) with assignment (=).

Special Operators: ++ and -- Changes value of variable before (or after) its value is used in an expression. Symbol Operation Usage Precedence Assoc ++ postincrement x++ 2 r-to-l -- postdecrement x-- 2 r-to-l ++ preincrement ++x 3 r-to-l <= predecrement --x 3 r-to-l Pre: Increment/decrement variable before using its value. Post: Increment/decrement variable after using its value.

Using ++ and -- x = 4; y = x++; Results: x = 5, y = 4 (because x is incremented after assignment) y = ++x; Results: x = 5, y = 5 (because x is incremented before assignment)

Practice with Precedence Assume a=1, b=2, c=3, d=4. x = a * b + c * d / 2; /* x = 8 */ same as: x = (a * b) + ((c * d) / 2); For long or confusing expressions, use parentheses, because reader might not have memorized precedence table. Note: Assignment operator has lowest precedence, so all the arithmetic operations on the right-hand side are evaluated first.

Symbol Table Like assembler, compiler needs to know information associated with identifiers in assembler, all identifiers were labels and information is address Compiler keeps more information Name (identifier) Type Location in memory Scope Name Type Offset Scope amount hours minutes rate seconds time int -3 -4 -1 -5 -2 main

Local Variable Storage Local variables are stored in an activation record, also known as a stack frame. Symbol table “offset” gives the distance from the base of the frame. R5 is the frame pointer – holds address of the base of the current frame. A new frame is pushed on the run-time stack each time a block is entered. Because stack grows downward, base is the highest address of the frame, and variable offsets are <= 0. seconds minutes hours time rate amount R5

Allocating Space for Variables Global data section All global variables stored here (actually all static variables) R4 points to beginning Run-time stack Used for local variables R6 points to top of stack R5 points to top frame on stack New frame for each block (goes away when block exited) Offset = distance from beginning of storage area Global: LDR R1, R4, #4 Local: LDR R2, R5, #-3 0x0000 instructions PC R4 global data R6 run-time stack R5 0xFFFF

Variables and Memory Locations In our examples, a variable is always stored in memory. When assigning to a variable, must store to memory location. A real compiler would perform code optimizations that try to keep variables allocated in registers. Why?

Example: Compiling to LC-3 #include <stdio.h> int inGlobal; main() { int inLocal; /* local to main */ int outLocalA; int outLocalB; /* initialize */ inLocal = 5; inGlobal = 3; /* perform calculations */ outLocalA = inLocal++ & ~inGlobal; outLocalB = (inLocal + inGlobal) - (inLocal - inGlobal); /* print results */ printf("The results are: outLocalA = %d, outLocalB = %d\n", outLocalA, outLocalB); }

Example: Symbol Table Name Type Offset Scope inGlobal int global global inLocal main outLocalA -1 outLocalB -2

Example: Code Generation ; main ; initialize variables AND R0, R0, #0 ADD R0, R0, #5 ; inLocal = 5 STR R0, R5, #0 ; (offset = 0) AND R0, R0, #0 ADD R0, R0, #3 ; inGlobal = 3 STR R0, R4, #0 ; (offset = 0)

Example (continued) ; first statement: ; outLocalA = inLocal++ & ~inGlobal; LDR R0, R5, #0 ; get inLocal ADD R1, R0, #1 ; increment STR R1, R5, #0 ; store LDR R1, R4, #0 ; get inGlobal NOT R1, R1 ; ~inGlobal AND R2, R0, R1 ; inLocal & ~inGlobal STR R2, R5, #-1 ; store in outLocalA ; (offset = -1)

Example (continued) ; next statement: ; outLocalB = (inLocal + inGlobal) ; - (inLocal - inGlobal); LDR R0, R5, #0 ; inLocal LDR R1, R4, #0 ; inGlobal ADD R0, R0, R1 ; R0 is sum LDR R2, R5, #0 ; inLocal LDR R3, R5, #0 ; inGlobal NOT R3, R3 ADD R3, R3, #1 ADD R2, R2, R3 ; R2 is difference NOT R2, R2 ; negate ADD R2, R2, #1 ADD R0, R0, R2 ; R0 = R0 - R2 STR R0, R5, #-2 ; outLocalB (offset = -2)

Special Operators: +=, *=, etc. Arithmetic and bitwise operators can be combined with assignment operator. Statement Equivalent assignment x += y; x = x + y; x -= y; x = x - y; x *= y; x = x * y; x /= y; x = x / y; x %= y; x = x % y; x &= y; x = x & y; x |= y; x = x | y; x ^= y; x = x ^ y; x <<= y; x = x << y; x >>= y; x = x >> y; All have same precedence and associativity as = and associate right-to-left.

Special Operator: Conditional Symbol Operation Usage Precedence Assoc ?: conditional x?y:z 16 l-to-r If x is TRUE (non-zero), result is y; else, result is z. Like a MUX, with x as the select signal. y z 1 x

Chapter 13 Control Structures

Control Structures Conditional Iteration making a decision about which code to execute, based on evaluated expression if if-else switch Iteration executing code multiple times, ending based on evaluated expression while for do-while

If if (condition) action; F condition T action Condition is a C expression, which evaluates to TRUE (non-zero) or FALSE (zero). Action is a C statement, which may be simple or compound (a block).

Example If Statements if (x <= 10) y = x * x + 5; if (x <= 10) { y = x * x + 5; z = (2 * y) / 3; } z = (2 * y) / 3; compound statement; both executed if x <= 10 only first statement is conditional; second statement is always executed

always true, so action is always executed! More If Examples if (0 <= age && age <= 11) kids += 1; if (month == 4 || month == 6 || month == 9 || month == 11) printf(“The month has 30 days.\n”); if (x = 2) y = 5; This is a common programming error (= instead of ==), not caught by compiler because it’s syntactically correct. always true, so action is always executed!

If’s Can Be Nested if (x == 3) if (y != 6) { z = z + 1; w = w + 2; } is the same as... if ((x == 3) && (y != 6)) { z = z + 1; w = w + 2; }

Generating Code for If Statement ; if (x == 2) y = 5; LDR R0, R5, #0 ; load x into R0 ADD R0, R0, #-2 ; subtract 2 BRnp NOT_TRUE ; if non-zero, x is not 2 AND R1, R1, #0 ; store 5 to y ADD R1, R1, #5 STR R1, R5, #-1 NOT_TRUE ... ; next statement

If-else if (condition) action_if; else action_else; condition T F Else allows choice between two mutually exclusive actions without re-testing condition.

Generating Code for If-Else LDR R0, R5, #0 BRz ELSE ; x is not zero LDR R1, R5, #-1 ; incr y ADD R1, R1, #1 STR R1, R5, #-1 LDR R1, R5, #02 ; decr z ADD R1, R1, #1 STR R1, R5, #-2 JMP DONE ; skip else code ; x is zero ELSE LDR R1, R5, #-1 ; decr y ADD R1, R1, #-1 STR R1, R5, #-1 LDR R1, R5, #-2 ; incr z ADD R1, R1, #1 STR R1, R5, #-2 DONE ... ; next statement if (x) { y++; z--; } else { y--; z++; }

Matching Else with If Else is always associated with closest unassociated if. if (x != 10) if (y > 3) z = z / 2; else z = z * 2; is the same as... is NOT the same as... if (x != 10) { if (y > 3) z = z / 2; else z = z * 2; } if (x != 10) { if (y > 3) z = z / 2; } else z = z * 2;

Chaining If’s and Else’s if (month == 4 || month == 6 || month == 9 || month == 11) printf(“Month has 30 days.\n”); else if (month == 1 || month == 3 || month == 5 || month == 7 || month == 8 || month == 10 || month == 12) printf(“Month has 31 days.\n”); else if (month == 2) printf(“Month has 28 or 29 days.\n”); else printf(“Don’t know that month.\n”);

While while (test) loop_body; F test T loop_body Executes loop body as long as test evaluates to TRUE (non-zero). Note: Test is evaluated before executing loop body.

Generating Code for While AND R0, R0, #0 STR R0, R5, #0 ; x = 0 ; test LOOP LDR R0, R5, #0 ; load x ADD R0, R0, #-10 BRzp DONE ; loop body LDR R0, R5, #0 ; load x ... <printf> ... ADD R0, R0, #1 ; incr x STR R0, R5, #0 JMP LOOP ; test again DONE ; next statement x = 0; while (x < 10) { printf(“%d ”, x); x = x + 1; }

Infinite Loops The following loop will never terminate: x = 0; while (x < 10) printf(“%d ”, x); Loop body does not change condition, so test never fails. This is a common programming error that can be difficult to find.

For for (init; end-test; re-init) statement init F test T loop_body Executes loop body as long as test evaluates to TRUE (non-zero). Initialization and re-initialization code includedin loop statement. Note: Test is evaluated before executing loop body. re-init

Generating Code for For ; init AND R0, R0, #0 STR R0, R5, #0 ; i = 0 ; test LOOP LDR R0, R5, #0 ; load i ADD R0, R0, #-10 BRzp DONE ; loop body LDR R0, R5, #0 ; load i ... <printf> ... ; re-init ADD R0, R0, #1 ; incr i STR R0, R5, #0 JMP LOOP ; test again DONE ; next statement for (i = 0; i < 10; i++) printf(“%d ”, i); This is the same as the while example!

Example For Loops /* -- what is the output of this loop? -- */ for (i = 0; i <= 10; i ++) printf("%d ", i); /* -- what does this one output? -- */ letter = 'a'; for (c = 0; c < 26; c++) printf("%c ", letter+c); /* -- what does this loop do? -- */ numberOfOnes = 0; for (bitNum = 0; bitNum < 16; bitNum++) { if (inputValue & (1 << bitNum)) numberOfOnes++; }

Nested Loops Loop body can (of course) be another loop. /* print a multiplication table */ for (mp1 = 0; mp1 < 10; mp1++) { for (mp2 = 0; mp2 < 10; mp2++) { printf(“%d\t”, mp1*mp2); } printf(“\n”); } Braces aren’t necessary, but they make the code easier to read.

Another Nested Loop The test for the inner loop depends on the counter variable of the outer loop. for (outer = 1; outer <= input; outer++) { for (inner = 0; inner < outer; inner++) { sum += inner; } }

For vs. While In general: For loop is preferred for counter-based loops. Explicit counter variable Easy to see how counter is modified each loop While loop is preferred for sentinel-based loops. Test checks for sentinel value. Either kind of loop can be expressed as the other, so it’s really a matter of style and readability.

Do-While do loop_body; loop_body while (test); test T F Executes loop body as long as test evaluates to TRUE (non-zero). Note: Test is evaluated after executing loop body.

Problem Solving in C Stepwise Refinement as covered in Chapter 6 ...but can stop refining at a higher level of abstraction. Same basic constructs Sequential -- C statements Conditional -- if-else, switch Iterative -- while, for, do-while

Problem 1: Calculating Pi Calculate  using its series expansion. User inputs number of terms. Start Evaluate Series Initialize Output Results Get Input Stop

Pi: 1st refinement Start Initialize for loop Get Input Evaluate Series iteration count Initialize for loop count<terms F Get Input T Evaluate Series Evaluate next term Output Results count = count+1 Stop

Pi: 2nd refinement if-else Initialize iteration count count is odd T F count<terms F subtract term add term T Evaluate next term add term count = count+1 if-else

Pi: Code for Evaluate Terms for (count=0; count < numOfTerms; count++) { if (count % 2) { /* odd term -- subtract */ pi -= 4.0 / (2 * count + 1); } else { /* even term -- add */ pi += 4.0 / (2 * count + 1); } Note: Code in text is slightly different, but this code corresponds to equation.

Pi: Complete Code #include <stdio.h> main() { double pi = 0.0; int numOfTerms, count; printf("Number of terms (must be 1 or larger) : "); scanf("%d", &numOfTerms); for (count=0; count < numOfTerms; count++) { if (count % 2) { pi -= 4.0 / (2 * count + 1); /* odd term -- subtract */ } else { pi += 4.0 / (2 * count + 1); /* even term -- add */ } printf("The approximate value of pi is %f\n", pi); }

Problem 2: Finding Prime Numbers Print all prime numbers less than 100. A number is prime if its only divisors are 1 and itself. All non-prime numbers less than 100 will have a divisor between 2 and 10. Start Initialize Print primes Stop

Primes: 1st refinement Start Stop Initialize num = 2 num < 100 F Print primes Print num if prime num = num + 1 Stop

Primes: 2nd refinement Initialize num = 2 Divide num by 2 through 10 no divisors? F T Print num if prime T Print num num = num + 1

Clear flag if num%divisor > 0 Primes: 3rd refinement Initialize divisor = 2 Divide num by 2 through 10 divisor <= 10 F no divisors? F T T Clear flag if num%divisor > 0 Print num divisor = divisor + 1

Primes: Using a Flag Variable To keep track of whether number was divisible, we use a "flag" variable. Set prime = TRUE, assuming that this number is prime. If any divisor divides number evenly, set prime = FALSE. Once it is set to FALSE, it stays FALSE. After all divisors are checked, number is prime if the flag variable is still TRUE. Use macros to help readability. #define TRUE 1 #define FALSE 0

Optimization: Could put a break here to avoid some work. Primes: Complete Code #include <stdio.h> #define TRUE 1 #define FALSE 0 main () { int num, divisor, prime; /* start with 2 and go up to 100 */ for (num = 2; num < 100; num ++ ) { prime = TRUE; /* assume num is prime */ /* test whether divisible by 2 through 10 */ for (divisor = 2; divisor <= 10; divisor++) if (((num % divisor) == 0) && (num != divisor)) prime = FALSE; /* not prime */ if (prime) /* if prime, print it */ printf("The number %d is prime\n", num); } } Optimization: Could put a break here to avoid some work. (Section 13.5.2)

Switch evaluate expression switch (expression) { case const1: action1; break; case const2: action2; break; default: action3; } = const1? action1 T F = const2? action2 T F action3 Alternative to long if-else chain. If break is not used, then case "falls through" to the next.

Switch Example /* same as month example for if-else */ switch (month) { case 4: case 6: case 9: case 11: printf(“Month has 30 days.\n”); break; case 1: case 3: /* some cases omitted for brevity...*/ printf(“Month has 31 days.\n”); break; case 2: printf(“Month has 28 or 29 days.\n”); break; default: printf(“Don’t know that month.\n”); }

If a is 2, prints “BC”. Otherwise, prints “C”. More About Switch Case expressions must be constant. case i: /* illegal if i is a variable */ If no break, then next case is also executed. switch (a) { case 1: printf(“A”); case 2: printf(“B”); default: printf(“C”); } If a is 1, prints “ABC”. If a is 2, prints “BC”. Otherwise, prints “C”.

Problem 3: Searching for Substring Have user type in a line of text (ending with linefeed) and print the number of occurrences of "the". Reading characters one at a time Use the getchar() function -- returns a single character. Don't need to store input string; look for substring as characters are being typed. Similar to state machine: based on characters seen, move toward success state or move back to start state. Switch statement is a good match to state machine.

Substring: State machine to flow chart read char no match other match = 0 T 't' if 't', match=1 matched 't' F other if 'h', match=2 if 't', match=1 else match=0 match = 1 T 't' 'h' matched 'th' F 't' if 'e', count++ and match = 0 if 't', match=1 else match=0 other match = 2 T 'e' F matched 'the' 't' other increment count

Substring: Code (Part 1) #include <stdio.h> main() { char key; /* input character from user */ int match = 0; /* keep track of characters matched */ int count = 0; /* number of substring matches */ /* Read character until newline is typed */ while ((key = getchar()) != '\n') { /* Action depends on number of matches so far */ switch (match) { case 0: /* starting - no matches yet */ if (key == 't') match = 1; break;

Substring: Code (Part 2) case 1: /* 't' has been matched */ if (key == 'h') match = 2; else if (key == 't') match = 1; else match = 0; break;

Substring: Code (Part 3) case 2: /* 'th' has been matched */ if (key == 'e') { count++; /* increment count */ match = 0; /* go to starting point */ } else if (key == 't') { match = 1; else match = 0; break; } } printf("Number of matches = %d\n", count); }

Break and Continue break; continue; used only in switch statement or iteration statement passes control out of the “smallest” (loop or switch) statement containing it to the statement immediately following usually used to exit a loop before terminating condition occurs (or to exit switch statement when case is done) continue; used only in iteration statement terminates the execution of the loop body for this iteration loop expression is evaluated to see whether another iteration should be performed if for loop, also executes the re-initializer

Example What does the following loop do? for (i = 0; i <= 20; i++) { if (i%2 == 0) continue; printf("%d ", i); } What would be an easier way to write this? What happens if break instead of continue?

Chapter 14 Functions

Function Smaller, simpler, subcomponent of program Provides abstraction hide low-level details give high-level structure to program, easier to understand overall program flow enables separable, independent development C functions zero or multiple arguments passed in single result returned (optional) return value is always a particular type In other languages, called procedures, subroutines, ...

Example of High-Level Structure main() { SetupBoard(); /* place pieces on board */ DetermineSides(); /* choose black/white */ /* Play game */ do { WhitesTurn(); BlacksTurn(); } while (NoOutcomeYet()); } Structure of program is evident, even without knowing implementation.

3. use return value in expression Functions in C Declaration (also called prototype) int Factorial(int n); Function call -- used in expression a = x + Factorial(f + g); type of return value name of function types of all arguments 1. evaluate arguments 2, execute function 3. use return value in expression

gives control back to calling function and returns value Function Definition State type, name, types of arguments must match function declaration give name to each argument (doesn't have to match declaration) int Factorial(int n) { int i; int result = 1; for (i = 1; i <= n; i++) result *= i; return result; } gives control back to calling function and returns value

Why Declaration? Since function definition also includes return and argument types, why is declaration needed? Use might be seen before definition. Compiler needs to know return and arg types and number of arguments. Definition might be in a different file, written by a different programmer. include a "header" file with function declarations only compile separately, link together to make executable

function call (invocation) Example double ValueInDollars(double amount, double rate); main() { ... dollars = ValueInDollars(francs, DOLLARS_PER_FRANC); printf("%f francs equals %f dollars.\n", francs, dollars); ... } double ValueInDollars(double amount, double rate) { return amount * rate; declaration function call (invocation) definition

Implementing Functions: Overview Activation record information about each function, including arguments and local variables stored on run-time stack Calling function Called function push new activation record copy values into arguments call function execute code put result in activation record pop activation record from stack return get result from stack

Run-Time Stack Recall that local variables are stored on the run-time stack in an activation record Frame pointer (R5) points to the beginning of a region of activation record that stores local variables for the current function When a new function is called, its activation record is pushed on the stack; when it returns, its activation record is popped off of the stack.

Run-Time Stack Before call During call After call Memory Memory Memory Watt R6 R6 R5 R5 main main main Before call During call After call

Activation Record int NoName(int a, int b) { int w, x, y; . . . return y; } y x w dynamic link return address return value a b locals R5 bookkeeping args Name Type Offset Scope a b w x y int 4 5 -1 -2 NoName

Activation Record Bookkeeping Return value space for value returned by function allocated even if function does not return a value Return address save pointer to next instruction in calling function convenient location to store R7 in case another function (JSR) is called Dynamic link caller’s frame pointer used to pop this activation record from stack

Example Function Call int Volta(int q, int r) { int k; int m; ... return k; } int Watt(int a) { int w; ... w = Volta(w,10); ... return w; }

Calling the Function w = Volta(w, 10); 25 10 q r w dyn link ret addr ret val a w = Volta(w, 10); ; push second arg AND R0, R0, #0 ADD R0, R0, #10 ADD R6, R6, #-1 STR R0, R6, #0 ; push first argument LDR R0, R5, #0 ADD R6, R6, #-1 STR R0, R6, #0 ; call subroutine JSR Volta new R6 R6 R5 xFD00 Note: Caller needs to know number and type of arguments, doesn't know about local variables.

Starting the Callee Function xFCFB x3100 25 10 m k dyn link ret addr ret val q r w a ; leave space for return value ADD R6, R6, #-1 ; push return address ADD R6, R6, #-1 STR R7, R6, #0 ; push dyn link (caller’s frame ptr) ADD R6, R6, #-1 STR R5, R6, #0 ; set new frame pointer ADD R5, R6, #-1 ; allocate space for locals ADD R6, R6, #-2 new R6 new R5 R6 R5 xFD00

Ending the Callee Function -43 217 xFCFB x3100 25 10 m k dyn link ret addr ret val q r w a return k; ; copy k into return value LDR R0, R5, #0 STR R0, R5, #3 ; pop local variables ADD R6, R5, #1 ; pop dynamic link (into R5) LDR R5, R6, #0 ADD R6, R6, #1 ; pop return addr (into R7) LDR R7, R6, #0 ADD R6, R6, #1 ; return control to caller RET R6 R5 new R6 new R5 xFD00

Resuming the Caller Function 217 25 10 ret val q r w dyn link ret addr a w = Volta(w,10); JSR Volta ; load return value (top of stack) LDR R0, R6, #0 ; perform assignment STR R0, R5, #0 ; pop return value ADD R6, R6, #1 ; pop arguments ADD R6, R6, #2 R6 new R6 R5 xFD00

Summary of LC-3 Function Call Implementation Caller pushes arguments (last to first). Caller invokes subroutine (JSR). Callee allocates return value, pushes R7 and R5. Callee allocates space for local variables. Callee executes function code. Callee stores result into return value slot. Callee pops local vars, pops R5, pops R7. Callee returns (JMP R7). Caller loads return value and pops arguments. Caller resumes computation…

Chapter 15 Debugging

Debugging with High Level Languages Same goals as low-level debugging Examine and set values in memory Execute portions of program Stop execution when (and where) desired Want debugging tools to operate on high-level language constructs Examine and set variables, not memory locations Trace and set breakpoints on statements and function calls, not instructions ...but also want access to low-level tools when needed

Types of Errors Syntactic Errors Semantic Errors Algorithmic Errors Input code is not legal Caught by compiler (or other translation mechanism) Semantic Errors Legal code, but not what programmer intended Not caught by compiler, because syntax is correct Algorithmic Errors Problem with the logic of the program Program does what programmer intended, but it doesn't solve the right problem

Syntactic Errors Common errors: missing semicolon or brace mis-spelled type in declaration One mistake can cause an avalanche of errors because compiler can't recover and gets confused main () { int i int j; for (i = 0; i <= 10; i++) { j = i * 7; printf("%d x 7 = %d\n", i, j); } } missing semicolon

missing braces, so printf not part of if Semantic Errors Common Errors Missing braces to group statements together Confusing assignment with equality Wrong assumptions about operator precedence, associativity Wrong limits on for-loop counter Uninitialized variables h main () { int i int j; for (i = 0; i <= 10; i++) j = i * 7; printf("%d x 7 = %d\n", i, j); } missing braces, so printf not part of if

Algorithmic Errors Design is wrong, so program does not solve the correct problem Difficult to find Program does what we intended Problem might not show up until many runs of program Maybe difficult to fix Have to redesign, may have large impact on program code Classic example: Y2K bug only allow 2 digits for year, assuming 19__

Debugging Techniques Ad-Hoc Source-Level Debugger Insert printf statements to track control flow and values Code explicitly checks for values out of expected range, etc. Advantage: No special debugging tools needed Disadvantages: Requires intimate knowledge of code and expected values Frequent re-compile and execute cycles Inserted code can be buggy Source-Level Debugger Examine and set variable values Tracing, breakpoints, single-stepping on source-code statements

Source-Level Debugger main window of Cygwin version of gdb

Source-Level Debugging Techniques Breakpoints Stop when a particular statement is reached Stop at entry or exit of a function Conditional breakpoints: Stop if a variable is equal to a specific value, etc. Watchpoints: Stop when a variable is set to a specific value Single-Stepping Execute one statement at a time Step "into" or step "over" function calls Step into: next statement is first inside function call Step over: execute function without stopping Step out: finish executing current function and stop on exit

Source-Level Debugging Techniques Displaying Values Show value consistent with declared type of variable Dereference pointers (variables that hold addresses) See Chapter 17 Inspect parts of a data structure See Chapters 19 and 17

Chapter 16 Pointers and Arrays

Pointers and Arrays We've seen examples of both of these in our LC-3 programs; now we'll see them in C. Pointer Address of a variable in memory Allows us to indirectly access variables in other words, we can talk about its address rather than its value Array A list of values arranged sequentially in memory Example: a list of telephone numbers Expression a[4] refers to the 5th element of the array a

Address vs. Value Sometimes we want to deal with the address of a memory location, rather than the value it contains. Recall example from Chapter 6: adding a column of numbers. R2 contains address of first location. Read value, add to sum, and increment R2 until all numbers have been processed. R2 is a pointer -- it contains the address of data we’re interested in. address value x3107 x3100 x2819 x3101 R2 x3100 x0110 x3102 x0310 x3103 x0100 x3104 x1110 x3105 x11B1 x3106 x0019 x3107

Another Need for Addresses Consider the following function that's supposed to swap the values of its arguments. void Swap(int firstVal, int secondVal) { int tempVal = firstVal; firstVal = secondVal; secondVal = tempVal; }

Executing the Swap Function before call after call These values changed... firstVal secondVal valueB valueA tempVal firstVal secondVal valueB valueA 3 4 3 4 Swap R6 R6 ...but these did not. main Swap needs addresses of variables outside its own activation record.

Pointers in C C lets us talk about and manipulate pointers as variables and in expressions. Declaration int *p; /* p is a pointer to an int */ A pointer in C is always a pointer to a particular data type: int*, double*, char*, etc. Operators *p -- returns the value pointed to by p &z -- returns the address of variable z

Example int i; int *ptr; i = 4; ptr = &i; *ptr = *ptr + 1; store the value 4 into the memory location associated with i store the address of i into the memory location associated with ptr read the contents of memory at the address stored in ptr store the result into memory at the address stored in ptr

Example: LC-3 Code ; i is 1st local (offset 0), ptr is 2nd (offset -1) AND R0, R0, #0 ; clear R0 ADD R0, R0, #4 ; put 4 in R0 STR R0, R5, #0 ; store in i ; ptr = &i; ADD R0, R5, #0 ; R0 = R5 + 0 (addr of i) STR R0, R5, #-1 ; store in ptr ; *ptr = *ptr + 1; LDR R0, R5, #-1 ; R0 = ptr LDR R1, R0, #0 ; load contents (*ptr) ADD R1, R1, #1 ; add one STR R1, R0, #0 ; store result where R0 points

Pointers as Arguments Passing a pointer into a function allows the function to read/change memory outside its activation record. void NewSwap(int *firstVal, int *secondVal) { int tempVal = *firstVal; *firstVal = *secondVal; *secondVal = tempVal; } Arguments are integer pointers. Caller passes addresses of variables that it wants function to change.

Passing Pointers to a Function main() wants to swap the values of valueA and valueB passes the addresses to NewSwap: NewSwap(&valueA, &valueB); Code for passing arguments: ADD R0, R5, #-1 ; addr of valueB ADD R6, R6, #-1 ; push STR R0, R6, #0 ADD R0, R5, #0 ; addr of valueA ADD R6, R6, #-1 ; push STR R0, R6, #0 xEFFA xEFF9 4 3 tempVal firstVal secondVal valueB valueA R6 R5 xEFFD

Code Using Pointers Inside the NewSwap routine ; int tempVal = *firstVal; LDR R0, R5, #4 ; R0=xEFFA LDR R1, R0, #0 ; R1=M[xEFFA]=3 STR R1, R5, #4 ; tempVal=3 ; *firstVal = *secondVal; LDR R1, R5, #5 ; R1=xEFF9 LDR R2, R1, #0 ; R1=M[xEFF9]=4 STR R2, R0, #0 ; M[xEFFA]=4 ; *secondVal = tempVal; LDR R2, R5, #0 ; R2=3 STR R2, R1, #0 ; M[xEFF9]=3 R6 3 xEFFA xEFF9 4 tempVal firstVal secondVal valueB valueA R5 xEFFD

Null Pointer Sometimes we want a pointer that points to nothing. In other words, we declare a pointer, but we’re not ready to actually point to something yet. int *p; p = NULL; /* p is a null pointer */ NULL is a predefined macro that contains a value that a non-null pointer should never hold. Often, NULL = 0, because Address 0 is not a legal address for most programs on most platforms.

Using Arguments for Results Pass address of variable where you want result stored useful for multiple results Example: return value via pointer return status code as function result This solves the mystery of why ‘&’ with argument to scanf: scanf("%d ", &dataIn); read a decimal integer and store in dataIn

Syntax for Pointer Operators Declaring a pointer type *var; type* var; Either of these work -- whitespace doesn't matter. Type of variable is int* (integer pointer), char* (char pointer), etc. Creating a pointer &var Must be applied to a memory object, such as a variable. In other words, &3 is not allowed. Dereferencing Can be applied to any expression. All of these are legal: *var contents of mem loc pointed to by var **var contents of mem loc pointed to by memory location pointed to by var *3 contents of memory location 3

Example using Pointers IntDivide performs both integer division and remainder, returning results via pointers. (Returns –1 if divide by zero.) int IntDivide(int x, int y, int *quoPtr, int *remPtr); main() { int dividend, divisor; /* numbers for divide op */ int quotient, remainer; /* results */ int error; /* ...code for dividend, divisor input removed... */ error = IntDivide(dividend, divisor, &quotient, &remainder); /* ...remaining code removed... */ }

C Code for IntDivide int IntDivide(int x, int y, int *quoPtr, int *remPtr) { if (y != 0) { *quoPtr = x / y; /* quotient in *quoPtr */ *remPtr = x % y; /* remainder in *remPtr */ return 0; } else return –1; }

int num0; int num1; int num2; int num3; Arrays How do we allocate a group of memory locations? character string table of numbers How about this? Not too bad, but… what if there are 100 numbers? how do we write a loop to process each number? Fortunately, C gives us a better way -- the array. int num[4]; Declares a sequence of four integers, referenced by: num[0], num[1], num[2], num[3]. int num0; int num1; int num2; int num3;

Array Syntax Declaration type variable[num_elements]; Array Reference variable[index]; all array elements are of the same type number of elements must be known at compile-time i-th element of array (starting with zero); no limit checking at compile-time or run-time

Array as a Local Variable Array elements are allocated as part of the activation record. int grid[10]; First element (grid[0]) is at lowest address of allocated space. If grid is first variable allocated, then R5 will point to grid[9]. grid[0] grid[1] grid[2] grid[3] grid[4] grid[5] grid[6] grid[7] grid[8] grid[9]

LC-3 Code for Array References ; x = grid[3] + 1 ADD R0, R5, #-9 ; R0 = &grid[0] LDR R1, R0, #3 ; R1 = grid[3] ADD R1, R1, #1 ; plus 1 STR R1, R5, #-10 ; x = R1 ; grid[6] = 5; AND R0, R0, #0 ADD R0, R0, #5 ; R0 = 5 ADD R1, R5, #-9 ; R1 = &grid[0] STR R0, R1, #6 ; grid[6] = R0 x grid[0] grid[1] grid[2] grid[3] grid[4] grid[5] grid[6] grid[7] grid[8] grid[9] R5

More LC-3 Code ; grid[x+1] = grid[x] + 2 LDR R0, R5, #-10 ; R0 = x ADD R1, R5, #-9 ; R1 = &grid[0] ADD R1, R0, R1 ; R1 = &grid[x] LDR R2, R1, #0 ; R2 = grid[x] ADD R2, R2, #2 ; add 2 LDR R0, R5, #-10 ; R0 = x ADD R0, R0, #1 ; R0 = x+1 ADD R1, R5, #-9 ; R1 = &grid[0] ADD R1, R0, R1 ; R1 = &grix[x+1] STR R2, R1, #0 ; grid[x+1] = R2 x grid[0] grid[1] grid[2] grid[3] grid[4] grid[5] grid[6] grid[7] grid[8] grid[9] R5

Passing Arrays as Arguments C passes arrays by reference the address of the array (i.e., of the first element) is written to the function's activation record otherwise, would have to copy each element main() { int numbers[MAX_NUMS]; … mean = Average(numbers); … } int Average(int inputValues[MAX_NUMS]) { … for (index = 0; index < MAX_NUMS; index++) sum = sum + indexValues[index]; return (sum / MAX_NUMS); } This must be a constant, e.g., #define MAX_NUMS 10

A String is an Array of Characters Allocate space for a string just like any other array: char outputString[16]; Space for string must contain room for terminating zero. Special syntax for initializing a string: char outputString[16] = "Result = "; …which is the same as: outputString[0] = 'R'; outputString[1] = 'e'; outputString[2] = 's'; ...

I/O with Strings Printf and scanf use "%s" format character for string Printf -- print characters up to terminating zero printf("%s", outputString); Scanf -- read characters until whitespace, store result in string, and terminate with zero scanf("%s", inputString);

Relationship between Arrays and Pointers An array name is essentially a pointer to the first element in the array char word[10]; char *cptr; cptr = word; /* points to word[0] */ Difference: Can change the contents of cptr, as in cptr = cptr + 1; (The identifier "word" is not a variable.)

Correspondence between Ptr and Array Notation Given the declarations on the previous page, each line below gives three equivalent expressions: cptr word &word[0] (cptr + n) word + n &word[n] *cptr *word word[0] *(cptr + n) *(word + n) word[n]

Common Pitfalls with Arrays in C Overrun array limits There is no checking at run-time or compile-time to see whether reference is within array bounds. int array[10]; int i; for (i = 0; i <= 10; i++) array[i] = 0; Declaration with variable size Size of array must be known at compile time. void SomeFunction(int num_elements) { int temp[num_elements]; … }

allocates 20 words (2 per element) same as x[3] -- base address plus 6 Pointer Arithmetic Address calculations depend on size of elements In our LC-3 code, we've been assuming one word per element. e.g., to find 4th element, we add 4 to base address It's ok, because we've only shown code for int and char, both of which take up one word. If double, we'd have to add 8 to find address of 4th element. C does size calculations under the covers, depending on size of item being pointed to: double x[10]; double *y = x; *(y + 3) = 13; allocates 20 words (2 per element) same as x[3] -- base address plus 6

Chapter 17 Recursion

What is Recursion? A recursive function is one that solves its task by calling itself on smaller pieces of data. Similar to recurrence function in mathematics. Like iteration -- can be used interchangeably; sometimes recursion results in a simpler solution. Example: Running sum ( ) Mathematical Definition: RunningSum(1) = 1 RunningSum(n) = n + RunningSum(n-1) Recursive Function: int RunningSum(int n) { if (n == 1) return 1; else return n + RunningSum(n-1); }

Executing RunningSum res = RunningSum(4); return 4 + RunningSum(3); return value = 10 return 4 + RunningSum(3); RunningSum(3) return value = 6 return 3 + RunningSum(2); RunningSum(2) return value = 3 return 2 + RunningSum(1); RunningSum(1) return value = 1 return 1;

High-Level Example: Binary Search Given a sorted set of exams, in alphabetical order, find the exam for a particular student. 1. Look at the exam halfway through the pile. 2. If it matches the name, we're done; if it does not match, then... 3a. If the name is greater (alphabetically), then search the upper half of the stack. 3b. If the name is less than the halfway point, then search the lower half of the stack.

Binary Search: Pseudocode Pseudocode is a way to describe algorithms without completely coding them in C. FindExam(studentName, start, end) { halfwayPoint = (end + start)/2; if (end < start) ExamNotFound(); /* exam not in stack */ else if (studentName == NameOfExam(halfwayPoint)) ExamFound(halfwayPoint); /* found exam! */ else if (studentName < NameOfExam(halfwayPoint)) /* search lower half */ FindExam(studentName, start, halfwayPoint - 1); else /* search upper half */ FindExam(studentName, halfwayPoint + 1, end); }

High-Level Example: Towers of Hanoi Task: Move all disks from current post to another post. Rules: (1) Can only move one disk at a time. (2) A larger disk can never be placed on top of a smaller disk. (3) May use third post for temporary storage. Post 1 Post 2 Post 3

Task Decomposition Suppose disks start on Post 1, and target is Post 3. 1. Move top n-1 disks to Post 2. 2. Move largest disk to Post 3. 3. Move n-1 disks from Post 2 to Post 3. 1 2 3 1 2 3 1 2 3

Task Decomposition (cont.) Task 1 is really the same problem, with fewer disks and a different target post. "Move n-1 disks from Post 1 to Post 2." And Task 3 is also the same problem, with fewer disks and different starting and target posts. "Move n-1 disks from Post 2 to Post 3." So this is a recursive algorithm. The terminal case is moving the smallest disk -- can move directly without using third post. Number disks from 1 (smallest) to n (largest).

Towers of Hanoi: Pseudocode MoveDisk(diskNumber, startPost, endPost, midPost) { if (diskNumber > 1) { /* Move top n-1 disks to mid post */ MoveDisk(diskNumber-1, startPost, midPost, endPost); printf("Move disk number %d from %d to %d.\n", diskNumber, startPost, endPost); /* Move n-1 disks from mid post to end post */ MoveDisk(diskNumber-1, midPost, endPost, startPost); } else printf("Move disk number 1 from %d to %d.\n", startPost, endPost); }

Detailed Example: Fibonacci Numbers Mathematical Definition: In other words, the n-th Fibonacci number is the sum of the previous two Fibonacci numbers.

Fibonacci: C Code int Fibonacci(int n) { if ((n == 0) || (n == 1)) return 1; else return Fibonacci(n-1) + Fibonacci(n-2); }

Activation Records Whenever Fibonacci is invoked, a new activation record is pushed onto the stack. main calls Fibonacci(3) Fibonacci(3) calls Fibonacci(2) Fibonacci(2) calls Fibonacci(1) R6 Fib(1) R6 Fib(2) Fib(2) R6 Fib(3) Fib(3) Fib(3) main main main

Activation Records (cont.) Fibonacci(1) returns, Fibonacci(2) calls Fibonacci(0) Fibonacci(2) returns, Fibonacci(3) calls Fibonacci(1) Fibonacci(3) returns R6 Fib(0) R6 Fib(2) Fib(1) Fib(3) Fib(3) R6 main main main

Tracing the Function Calls If we are debugging this program, we might want to trace all the calls of Fibonacci. Note: A trace will also contain the arguments passed into the function. For Fibonacci(3), a trace looks like: Fibonacci(3) Fibonacci(2) Fibonacci(1) Fibonacci(0) Fibonacci(1) What would trace of Fibonacci(4) look like?

Fibonacci: LC-3 Code Activation Record temp dynamic link return address return value n local bookkeeping Compiler generates temporary variable to hold result of first Fibonacci call. arg

LC-2 Code (part 1 of 3) Fibonacci ADD R6, R6, #-2 ; skip ret val, push ret addr STR R7, R6, #0 ADD R6, R6, #-1 ; push dynamic link STR R5, R6, #0 ADD R5, R6, #-1 ; set frame pointer ADD R6, R6, #-2 ; space for locals and temps LDR R0, R5, #4 ; load n BRz FIB_BASE ; check for terminal cases ADD R0, R0, #-1 BRz FIB_BASE

LC-3 Code (part 2 of 3) LDR R0, R5, #4 ; read parameter n ADD R0, R0, #-1 ; calculate n-1 ADD R6, R6, #-1 ; push n-1 STR R0, R6, #0 JSR Fibonacci ; call self LDR R0, R6, #0 ; pop return value ADD R6, R6, #1 STR R0, R5, #-1 ; store in temp LDR R0, R5, #4 ; read parameter n ADD R0, R0, #-2 ; calculate n-2 ADD R6, R6, #-1 ; push n-2 STR R0, R6, #0 JSR Fibonacci ; call self

LC-3 Code (part 3 of 3) LDR R0, R6, #0 ; pop return value ADD R6, R6, #1 LDR R1, R5, #-1 ; read temp ADD R0, R0, R1 ; Fibonacci(n-1) + Fibonacci(n-2) BRnzp FIB_END ; all done FIB_BASE AND R0, R0, #0 ; base case – return 1 ADD R0, R0, #1 FIB_END STR R0, R5, #3 ; write return value (R0) ADD R6, R5, #1 ; pop local variables LDR R5, R6, #0 ; pop dynamic link ADD R6, R6, #1 LDR R7, R6, #0 ; pop return address ADD R6, R6, #1 RET

A Final C Example: Printing an Integer Recursively converts an unsigned integer as a string of ASCII characters. If integer <10, convert to char and print. Else, call self on first (n-1) digits and then print last digit. void IntToAscii(int num) { int prefix, currDigit; if (num < 10) putchar(num + '0'); /* prints single char */ else { prefix = num / 10; /* shift right one digit */ IntToAscii(prefix); /* print shifted num */ /* then print shifted digit */ currDigit = num % 10; putchar(currDigit + '0'); } }

Trace of IntToAscii Calling IntToAscii with parameter 12345: IntToAscii(12345) IntToAscii(1234) IntToAscii(123) IntToAscii(12) IntToAscii(1) putchar('1') putchar('2') putchar('3') putchar('4') putchar('5')

Chapter 18 I/O in C

Standard C Library I/O commands are not included as part of the C language. Instead, they are part of the Standard C Library. A collection of functions and macros that must be implemented by any ANSI standard implementation. Automatically linked with every executable. Implementation depends on processor, operating system, etc., but interface is standard. Since they are not part of the language, compiler must be told about function interfaces. Standard header files are provided, which contain declarations of functions, variables, etc.

Basic I/O Functions The standard I/O functions are declared in the <stdio.h> header file. Function Description putchar Displays an ASCII character to the screen. getchar Reads an ASCII character from the keyboard. printf Displays a formatted string, scanf Reads a formatted string. fopen Open/create a file for I/O. fprintf Writes a formatted string to a file. fscanf Reads a formatted string from a file.

Text Streams All character-based I/O in C is performed on text streams. A stream is a sequence of ASCII characters, such as: the sequence of ASCII characters printed to the monitor by a single program the sequence of ASCII characters entered by the user during a single program the sequence of ASCII characters in a single file Characters are processed in the order in which they were added to the stream. E.g., a program sees input characters in the same order as the user typed them. Standard input stream (keyboard) is called stdin. Standard output stream (monitor) is called stdout.

Each of these calls prints 'h' to the screen. Character I/O putchar(c) Adds one ASCII character (c) to stdout. getchar() Reads one ASCII character from stdin. These functions deal with "raw" ASCII characters; no type conversion is performed. char c = 'h'; ... putchar(c); putchar('h'); putchar(104); Each of these calls prints 'h' to the screen.

Buffered I/O In many systems, characters are buffered in memory during an I/O operation. Conceptually, each I/O stream has its own buffer. Keyboard input stream Characters are added to the buffer only when the newline character (i.e., the "Enter" key) is pressed. This allows user to correct input before confirming with Enter. Output stream Characters are not flushed to the output device until the newline character is added.

Input Buffering printf("Input character 1:\n"); inChar1 = getchar(); printf("Input character 2:\n"); inChar2 = getchar(); After seeing the first prompt and typing a single character, nothing happens. Expect to see the second prompt, but character not added to stdin until Enter is pressed. When Enter is pressed, newline is added to stream and is consumed by second getchar(), so inChar2 is set to'\n'.

Output Buffering putchar('a'); /* generate some delay */ for (i=0; i<DELAY; i++) sum += i; putchar('b'); putchar('\n'); User doesn't see any character output until after the delay. 'a' is added to the stream before the delay, but the stream is not flushed (displayed) until '\n' is added.

Formatted I/O Printf and scanf allow conversion between ASCII representations and internal data types. Format string contains text to be read/written, and formatting characters that describe how data is to be read/written. %d signed decimal integer %f signed decimal floating-point number %x unsigned hexadecimal number %b unsigned binary number %c ASCII character %s ASCII string

Special Character Literals Certain characters cannot be easily represented by a single keystroke, because they correspond to whitespace (newline, tab, backspace, ...) are used as delimiters for other literals (quote, double quote, ...) These are represented by the following sequences: \n newline \t tab \b backspace \\ backslash \' single quote \" double quote \0nnn ASCII code nnn (in octal) \xnnn ASCII code nnn (in hex)

printf Prints its first argument (format string) to stdout with all formatting characters replaced by the ASCII representation of the corresponding data argument. int a = 100; int b = 65; char c = 'z'; char banner[10] = "Hola!"; double pi = 3.14159; printf("The variable 'a' decimal: %d\n", a); printf("The variable 'a' hex: %x\n", a); printf("The variable 'a' binary: %b\n", a); printf("'a' plus 'b' as character: %c\n", a+b); printf("A char %c.\t A string %s\n A float %f\n", c, banner, pi);

Missing Data Arguments What happens when you don't provide a data argument for every formatting character? printf("The value of nothing is %d\n"); %d will convert and print whatever is on the stack in the position where it expects the first argument. In other words, something will be printed, but it will be a garbage value as far as our program is concerned.

scanf Reads ASCII characters from stdin, matching characters to its first argument (format string), converting character sequences according to any formatting characters, and storing the converted values to the addresses specified by its data pointer arguments. char name[100]; int bMonth, bDay, bYear; double gpa; scanf("%s %d/%d/%d %lf", name, &bMonth, &bDay, &bYear, &gpa);

scanf Conversion For each data conversion, scanf will skip whitespace characters and then read ASCII characters until it encounters the first character that should NOT be included in the converted value. %d Reads until first non-digit. %x Reads until first non-digit (in hex). %s Reads until first whitespace character. Literals in format string must match literals in the input stream. Data arguments must be pointers, because scanf stores the converted value to that memory address.

Doesn't match literal '/', so scanf quits after second conversion. scanf Return Value The scanf function returns an integer, which indicates the number of successful conversions performed. This lets the program check whether the input stream was in the proper format. Example: scanf("%s %d/%d/%d %lf", name, &bMonth, &bDay, &bYear, &gpa); Input Stream Return Value Mudd 02/16/69 3.02 5 Muss 02 16 69 3.02 2 Doesn't match literal '/', so scanf quits after second conversion.

Bad scanf Arguments Two problems with scanf data arguments 1. Not a pointer int n = 0; scanf("%d", n); Will use the value of the argument as an address. 2. Missing data argument scanf("%d"); Will get address from stack, where it expects to find first data argument. If you're lucky, program will crash because of trying to modify a restricted memory location (e.g., location 0). Otherwise, your program will just modify an arbitrary memory location, which can cause very unpredictable behavior.

Variable Argument Lists The number of arguments in a printf or scanf call depends on the number of data items being read or written. Declaration of printf (from stdio.h): int printf(const char*, ...); Recall calling sequence from Chapter 14 Parameters pushed onto stack from right to left. This stack-based calling convention allows for a variable number of arguments, and fixed arguments (which are named first) are always the same offset from the frame ptr.

File I/O For our purposes, a file is a sequence of ASCII characters stored on some device. Allows us to process large amounts of data without having to type it in each time or read it all on the screen as it scrolls by. Each file is associated with a stream. May be input stream or output stream (or both!). The type of a stream is a "file pointer", declared as: FILE *infile; The FILE type is defined in <stdio.h>.

fopen The fopen (pronounced "eff-open") function associates a physical file with a stream. FILE *fopen(char* name, char* mode); First argument: name The name of the physical file, or how to locate it on the storage device. This may be dependent on the underlying operating system. Second argument: mode How the file will be used: "r" -- read from the file "w" -- write, starting at the beginning of the file "a" -- write, starting at the end of the file (append)

fprintf and fscanf Once a file is opened, it can be read or written using fscanf() and fprintf(), respectively. These are just like scanf() and printf(), except an additional argument specifies a file pointer. fprintf(outfile, "The answer is %d\n", x); fscanf(infile, "%s %d/%d/%d %lf", name, &bMonth, &bDay, &bYear, &gpa);

Chapter 19 Data Structures

Data Structures A data structure is a particular organization of data in memory. We want to group related items together. We want to organize these data bundles in a way that is convenient to program and efficient to execute. An array is one kind of data structure. In this chapter, we look at two more: struct – directly supported by C linked list – built from struct and dynamic allocation

Structures in C A struct is a mechanism for grouping together related data items of different types. Recall that an array groups items of a single type. Example: We want to represent an airborne aircraft: char flightNum[7]; int altitude; int longitude; int latitude; int heading; double airSpeed; We can use a struct to group these data together for each plane.

Defining a Struct We first need to define a new type for the compiler and tell it what our struct looks like. struct flightType { char flightNum[7]; /* max 6 characters */ int altitude; /* in meters */ int longitude; /* in tenths of degrees */ int latitude; /* in tenths of degrees */ int heading; /* in tenths of degrees */ double airSpeed; /* in km/hr */ }; This tells the compiler how big our struct is and how the different data items (“members”) are laid out in memory. But it does not allocate any memory.

Declaring and Using a Struct To allocate memory for a struct, we declare a variable using our new data type. struct flightType plane; Memory is allocated, and we can access individual members of this variable: plane.airSpeed = 800.0; plane.altitude = 10000; A struct’s members are laid out in the order specified by the definition. plane.flightNum[0] plane.flightNum[6] plane.altitude plane.longitude plane.latitude plane.heading plane.airspeed

Defining and Declaring at Once You can both define and declare a struct at the same time. struct flightType { char flightNum[7]; /* max 6 characters */ int altitude; /* in meters */ int longitude; /* in tenths of degrees */ int latitude; /* in tenths of degrees */ int heading; /* in tenths of degrees */ double airSpeed; /* in km/hr */ } maverick; And you can use the flightType name to declare other structs. struct flightType iceMan;

typedef C provides a way to define a data type by giving a new name to a predefined type. Syntax: typedef <type> <name>; Examples: typedef int Color; typedef struct flightType WeatherData; typedef struct ab_type { int a; double b; } ABGroup;

Using typedef This gives us a way to make code more readable by giving application-specific names to types. Color pixels[500]; Flight plane1, plane2; Typical practice: Put typedef’s into a header file, and use type names in main program. If the definition of Color/Flight changes, you might not need to change the code in your main program file.

Generating Code for Structs Suppose our program starts out like this: int x; Flight plane; int y; plane.altitude = 0; ... LC-3 code for this assignment: AND R1, R1, #0 ADD R0, R5, #-13 ; R0=plane STR R1, R0, #7 ; 8th word y plane.flightNum[0] plane.flightNum[6] plane.altitude plane.longitude plane.latitude plane.heading plane.airspeed x R5

Array of Structs Can declare an array of structs: Flight planes[100]; Each array element is a struct (7 words, in this case). To access member of a particular element: planes[34].altitude = 10000; Because the [] and . operators are at the same precedence, and both associate left-to-right, this is the same as: (planes[34]).altitude = 10000;

Pointer to Struct We can declare and create a pointer to a struct: Flight *planePtr; planePtr = &planes[34]; To access a member of the struct addressed by dayPtr: (*planePtr).altitude = 10000; Because the . operator has higher precedence than *, this is NOT the same as: *planePtr.altitude = 10000; C provides special syntax for accessing a struct member through a pointer: planePtr->altitude = 10000;

Passing Structs as Arguments Unlike an array, a struct is always passed by value into a function. This means the struct members are copied to the function’s activation record, and changes inside the function are not reflected in the calling routine’s copy. Most of the time, you’ll want to pass a pointer to a struct. int Collide(Flight *planeA, Flight *planeB) { if (planeA->altitude == planeB->altitude) { ... } else return 0; }

Dynamic Allocation Suppose we want our weather program to handle a variable number of planes – as many as the user wants to enter. We can’t allocate an array, because we don’t know the maximum number of planes that might be required. Even if we do know the maximum number, it might be wasteful to allocate that much memory because most of the time only a few planes’ worth of data is needed. Solution: Allocate storage for data dynamically, as needed.

malloc The Standard C Library provides a function for allocating memory at run-time: malloc. void *malloc(int numBytes); It returns a generic pointer (void*) to a contiguous region of memory of the requested size (in bytes). The bytes are allocated from a region in memory called the heap. The run-time system keeps track of chunks of memory from the heap that have been allocated.

Using malloc To use malloc, we need to know how many bytes to allocate. The sizeof operator asks the compiler to calculate the size of a particular type. planes = malloc(n * sizeof(Flight)); We also need to change the type of the return value to the proper kind of pointer – this is called “casting.” planes = (Flight*) malloc(n* sizeof(Flight));

Example int airbornePlanes; Flight *planes; printf(“How many planes are in the air?”); scanf(“%d”, &airbornePlanes); planes = (Flight*) malloc(sizeof(Flight) * airbornePlanes); if (planes == NULL) { printf(“Error in allocating the data array.\n”); ... } planes[0].altitude = ... If allocation fails, malloc returns NULL. Note: Can use array notation or pointer notation.

free Once the data is no longer needed, it should be released back into the heap for later use. This is done using the free function, passing it the same address that was returned by malloc. void free(void*); If allocated data is not freed, the program might run out of heap memory and be unable to continue.

The Linked List Data Structure A linked list is an ordered collection of nodes, each of which contains some data, connected using pointers. Each node points to the next node in the list. The first node in the list is called the head. The last node in the list is called the tail. Node 0 Node 1 Node 2 NULL

Linked List vs. Array A linked list can only be accessed sequentially. To find the 5th element, for instance, you must start from the head and follow the links through four other nodes. Advantages of linked list: Dynamic size Easy to add additional nodes as needed Easy to add or remove nodes from the middle of the list (just add or redirect links) Advantage of array: Can easily and quickly access arbitrary elements

Example: Car Lot Create an inventory database for a used car lot. Support the following actions: Search the database for a particular vehicle. Add a new car to the database. Delete a car from the database. The database must remain sorted by vehicle ID. Since we don’t know how many cars might be on the lot at one time, we choose a linked list representation.

Car data structure Each car has the following characterics: vehicle ID, make, model, year, mileage, cost. Because it’s a linked list, we also need a pointer to the next node in the list: typedef struct carType Car; struct carType { int vehicleID; char make[20]; char model[20]; int year; int mileage; double cost; Car *next; /* ptr to next car in list */ }

Scanning the List Searching, adding, and deleting all require us to find a particular node in the list. We scan the list until we find a node whose ID is >= the one we’re looking for. Car *ScanList(Car *head, int searchID) { Car *previous, *current; previous = head; current = head->next; /* Traverse until ID >= searchID */ while ((current!=NULL) && (current->vehicleID < searchID)) { previous = current; current = current->next; } return previous; }

Adding a Node Create a new node with the proper info. Find the node (if any) with a greater vehicleID. “Splice” the new node into the list: new node Node 0 Node 1 Node 2 NULL

Excerpts from Code to Add a Node newNode = (Car*) malloc(sizeof(Car)); /* initialize node with new car info */ ... prevNode = ScanList(head, newNode->vehicleID); nextNode = prevNode->next; if ((nextNode == NULL) || (nextNode->vehicleID != newNode->vehicleID)) prevNode->next = newNode; newNode->next = nextNode; } else { printf(“Car already exists in database.”); free(newNode); }

Deleting a Node Find the node that points to the desired node. Redirect that node’s pointer to the next node (or NULL). Free the deleted node’s memory. Node 0 Node 1 Node 2 NULL

Excerpts from Code to Delete a Node printf(“Enter vehicle ID of car to delete:\n”); scanf(“%d”, vehicleID); prevNode = ScanList(head, vehicleID); delNode = prevNode->next; if ((delNode != NULL) && (delNode->vehicleID == vehicleID)) prevNode->next = delNode->next; free(delNode); } else { printf(“Vehicle not found in database.\n”); }

Building on Linked Lists The linked list is a fundamental data structure. Dynamic Easy to add and delete nodes The concepts described here will be helpful when learning about more elaborate data structures: Trees Hash Tables Directed Acyclic Graphs ...