
Jean-Pierre Deschamps · Elena Valderrama · Lluís Terés

Digital Systems

From Logic Gates to Processors


Jean-Pierre Deschamps School of Engineering Rovira i Virgili University Tarragona, Spain

Elena Valderrama Escola d’Enginyeria Campus de la UAB Bellaterra, Spain

Lluís Terés Microelectronics Institute of Barcelona IMB-CNM (CSIC) Campus UAB-Bellaterra, Cerdanyola Barcelona, Spain

ISBN 978-3-319-41197-2
ISBN 978-3-319-41198-9 (eBook)
DOI 10.1007/978-3-319-41198-9
Library of Congress Control Number: 2016947365

© Springer International Publishing Switzerland 2017

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland

Preface

Digital electronic components are present in almost all our private and professional activities:

• Our personal computers, our smartphones, or our tablets are made up of digital components such as microprocessors, memories, interface circuits, and so on.
• Digital components are also present within our cars, our TV sets, or even in our household appliances.
• They are essential components of practically any industrial production line.
• They are also essential components of public transport systems, of secure access control systems, and many others.

We could say that any activity involving:

• the acquisition of data from human interfaces or from different types of sensors,
• the storage of data,
• the transmission of data,
• the processing of data,
• the use of data to control human interfaces or to control different types of actuators (e.g., mechanical actuators),

can be performed in a safe and fast way by means of digital systems. Thus, nowadays digital systems constitute a basic technical discipline, essential to any engineer. That is the reason why the Engineering School of the Autonomous University of Barcelona (UAB) has designed an introductory course entitled "Digital Systems: From Logic Gates to Processors," available on the Coursera MOOC (Massive Open Online Course) platform. This book includes all the material presented in the above-mentioned MOOC.

Digital systems consist of electronic circuits made up (mainly) of transistors. A transistor is a very small device, similar to a simple switch. On the other hand, a digital component, like a microprocessor, is a very large circuit able to execute very complex operations. How can we build such a complex system (a microprocessor) using very simple building blocks


(the transistors)? The answer to this question is the central topic of a complete course on digital systems. This introductory course describes the basic methods used to develop digital systems: not only the traditional ones, based on the use of logic gates and flip-flops, but also more advanced techniques, based on hardware description languages and simulation and synthesis tools, that make it possible to design very large circuits. At the end of this course the reader:

• Will have some idea of the way a new digital system can be developed, generally starting from a functional specification; in particular, she/he will be able to:
  – Design digital systems of medium complexity
  – Describe digital systems using a high-level hardware description language
  – Understand the operation of computers at their most basic level
• Will know the main problems the development engineer is faced with during the process of developing a new circuit
• Will understand which design tools are necessary to develop a new circuit

This course addresses (at least) two categories of readers: on the one hand, people who simply want to know what a digital system is and how it can be developed; on the other hand, people who need some knowledge of digital systems as a first step toward other technical disciplines, such as computer architecture, robotics, bionics, avionics, and others.

Overview

Chapter 1 gives a general definition of digital systems, presents generic description methods, and gives some information about the way digital systems can be implemented in the form of electronic circuits.

Chapter 2 is devoted to combinational circuits, a particular type of digital circuit (memoryless circuits). Among others, it includes an introduction to Boolean algebra, one of the mathematical tools used to define the behavior of digital circuits.

In Chap. 3, a particular type of circuit, namely arithmetic circuits, is presented. Arithmetic circuits are present in almost any system, so they deserve a particular presentation. Furthermore, they constitute a first example of reusable blocks. Instead of developing systems from scratch, a common strategy in many technical disciplines is to reuse already developed parts. This modular approach is very common in software engineering and can also be considered in the case of digital circuits. As an example, think of building a multiplier using adders and one-digit multipliers.

Sequential circuits, which are circuits including memory elements, are the topic of Chap. 4. Basic sequential components (flip-flops) and basic building blocks (registers, counters, memories) are defined. Synthesis methods are


presented. In particular, the concept of finite state machines (FSM), a mathematical tool used to define the behavior of a sequential circuit, is introduced.

As an example of the application of the synthesis methods described in the previous chapters, the design of a complete digital system is presented in Chap. 5. It is a generic system, able to execute a set of algorithms depending on the contents of a memory block that stores a program. This type of system is called a processor, in this case a very simple one.

The last two chapters are dedicated to more general considerations about design methods and tools (Chap. 6) and about physical implementations (Chap. 7).

Throughout the course, a standard hardware description language, namely VHDL, is used to describe circuits. A short introduction to VHDL is included in Appendix A. In order to define algorithms, a more informal, non-executable language (pseudocode) is used. It is defined in Appendix B. Appendix C is an introduction to the binary numeration system used to represent numbers.

Tarragona, Spain    Jean-Pierre Deschamps
Bellaterra, Spain   Elena Valderrama
Barcelona, Spain    Lluís Terés

Acknowledgments

The authors thank the people who have helped them in developing this book, especially Prof. Mercè Rullán, who reviewed the text and is the author of Appendices B and C. They are grateful to the following institutions for providing the means to carry this work through to a successful conclusion: the Autonomous University of Barcelona, the National Center of Microelectronics (CSIC, Bellaterra, Spain), and the University Rovira i Virgili (Tarragona, Spain).


Contents

1 Digital Systems
  1.1 Definition
  1.2 Description Methods
    1.2.1 Functional Description
    1.2.2 Structural Description
    1.2.3 Hierarchical Description
  1.3 Digital Electronic Systems
    1.3.1 Real System Structure
    1.3.2 Electronic Components
    1.3.3 Synthesis of Digital Electronic Systems
  1.4 Exercises
  References

2 Combinational Circuits
  2.1 Definitions
  2.2 Synthesis from a Table
  2.3 Boolean Algebra
    2.3.1 Definition
    2.3.2 Some Additional Properties
    2.3.3 Boolean Functions and Truth Tables
    2.3.4 Example
  2.4 Logic Gates
    2.4.1 NAND and NOR
    2.4.2 XOR and XNOR
    2.4.3 Tristate Buffers and Tristate Inverters
  2.5 Synthesis Tools
    2.5.1 Redundant Terms
    2.5.2 Cube Representation
    2.5.3 Adjacency
    2.5.4 Karnaugh Map
  2.6 Propagation Time
  2.7 Other Logic Blocks
    2.7.1 Multiplexers
    2.7.2 Multiplexers and Memory Blocks
    2.7.3 Planes
    2.7.4 Address Decoder and Tristate Buffers
  2.8 Programming Language Structures
    2.8.1 If Then Else
    2.8.2 Case
    2.8.3 Loops
    2.8.4 Procedure Calls
    2.8.5 Conclusion
  2.9 Exercises
  References

3 Arithmetic Blocks
  3.1 Binary Adder
  3.2 Binary Subtractor
  3.3 Binary Adder/Subtractor
  3.4 Binary Multiplier
  3.5 Binary Divider
  3.6 Exercises
  References

4 Sequential Circuits
  4.1 Introductory Example
  4.2 Definition
  4.3 Explicit Functional Description
    4.3.1 State Transition Graph
    4.3.2 Example of Explicit Description Generation
    4.3.3 Next State Table and Output Table
  4.4 Bistable Components
    4.4.1 1-Bit Memory
    4.4.2 Latches and Flip-Flops
  4.5 Synthesis Method
  4.6 Sequential Components
    4.6.1 Registers
    4.6.2 Counters
    4.6.3 Memories
  4.7 Sequential Implementation of Algorithms
    4.7.1 A First Example
    4.7.2 Combinational vs. Sequential Implementation
  4.8 Finite-State Machines
    4.8.1 Definition
    4.8.2 VHDL Model
  4.9 Examples of Finite-State Machines
    4.9.1 Programmable Timer
    4.9.2 Sequence Recognition
  4.10 Exercises
  References

5 Synthesis of a Processor
  5.1 Definition
    5.1.1 Specification
    5.1.2 Design Strategy
  5.2 Functional Specification
    5.2.1 Instruction Types
    5.2.2 Specification
  5.3 Structural Specification
    5.3.1 Block Diagram
    5.3.2 Component Specification
  5.4 Component Implementation
    5.4.1 Input Selection Component
    5.4.2 Computation Resources
    5.4.3 Output Selection
    5.4.4 Register Bank
    5.4.5 Go To Component
  5.5 Complete Processor
    5.5.1 Instruction Encoding
    5.5.2 Instruction Decoder
    5.5.3 Complete Circuit
  5.6 Test
  References

6 Design Methods
  6.1 Structural Description
  6.2 RTL Behavioral Description
  6.3 High-Level Synthesis Tools
  References

7 Physical Implementation
  7.1 Manufacturing Technologies
  7.2 Implementation Strategies
    7.2.1 Standard Cell Approach
    7.2.2 Mask Programmable Gate Arrays
    7.2.3 Field Programmable Gate Arrays
  7.3 Synthesis and Physical Implementation Tools
  References

Appendix A: A VHDL Overview
Appendix B: Pseudocode Guidelines for the Description of Algorithms
Appendix C: Binary Numeration System
Index

About the Authors

Jean-Pierre Deschamps received an M.S. degree in electrical engineering from the University of Louvain, Belgium, in 1967; a Ph.D. in computer science from the Autonomous University of Barcelona, Spain, in 1983; and a Ph.D. degree in electrical engineering from the Polytechnic School of Lausanne, Switzerland, in 1984. He has worked in several companies and universities. His research interests include ASIC and FPGA design and digital arithmetic. He is the author of ten books and more than a hundred international papers.

Elena Valderrama received an M.S. degree in physics from the Autonomous University of Barcelona (UAB), Spain, in 1975, and a Ph.D. in 1979. Later, in 2006, she obtained a degree in medicine from the same university. She is currently a professor at the Microelectronics Department of the Engineering School of UAB. From 1980 to 1998, she was an assigned researcher at the IMB-CNM (CSIC), where she led several biomedical projects in which the design and integration of highly complex digital systems (VLSI) was crucial. Her current interests focus primarily on education, not only from the point of view of the professor but also in the management and quality control of engineering-related educational programs. Her research interests revolve around the biomedical applications of microelectronics.

Lluís Terés received an M.S. degree in 1982 and a Ph.D. in 1986, both in computer science, from the Autonomous University of Barcelona (UAB). He has worked at UAB since 1982 and at IMB-CNM (CSIC) since its creation in 1985. He is head of the Integrated Circuits and Systems (ICAS) group at IMB, with research activity in the fields of ASICs, sensor signal interfaces, body-implantable monitoring systems, integrated N/MEMS interfaces, flexible platform-based systems and SoC, and organic/printed microelectronics. He has participated in more than 60 industrial and research projects. He is coauthor of more than 70 papers and 8 patents. He has participated in two spin-offs. He is also a part-time assistant professor at UAB.


1 Digital Systems

This first chapter is divided into three sections. The first section defines the concept of a digital system. For that, the more general concept of a physical system is first defined. Then, the particular characteristics of digital physical systems are presented. In the second section, several methods of digital system specification are considered. A correct and unambiguous initial system specification is a key aspect of the development work. Finally, the third section is a brief introduction to digital electronics.

1.1 Definition

As a first step, the more general concept of a physical system is introduced. It is not easy to give a complete and rigorous definition of a physical system. Nevertheless, this expression has a rather clear intuitive meaning, and some of its more important characteristics can be underlined. A physical system could be defined as a set of interconnected objects or elements that realize some function and are characterized by a set of input signals, a set of output signals, and a relation between input and output signals. Furthermore, every signal is characterized by

• its type, for example a voltage, a pressure, a temperature, or a switch state;
• a range of values, for example all voltages between 0 and 1.5 V, or all temperatures between 15 and 25 °C.

Example 1.1 Consider the system of Fig. 1.1. It controls the working of a boiler that is part of a room heating system and is connected to a mechanical selector used to define a reference temperature. A temperature sensor measures the ambient temperature. Thus, the system has two input signals,

• pos: the selector position that defines the desired ambient temperature (any value between 10 and 30 °C),
• temp: the temperature measured by the sensor,

and one output signal,

• onoff, with two possible values, ON (start the boiler) and OFF (stop the boiler).

© Springer International Publishing Switzerland 2017
J.-P. Deschamps et al., Digital Systems, DOI 10.1007/978-3-319-41198-9_1


Fig. 1.1 Temperature control: the selector (graduated from 10 to 30) provides pos, the temperature sensor provides temp, and the temperature control block generates onoff, sent to the boiler.

The relation between inputs and output is defined by the following program, in which half_degree is a previously defined constant equal to 0.5.

Algorithm 1.1 Temperature Control

loop
  if temp < pos - half_degree then onoff = on;
  elsif temp > pos + half_degree then onoff = off;
  end if;
  wait for 10 s;
end loop;

This is a pseudo-code program. An introduction to pseudo-code is given in Appendix B. However, this piece of program is quite easy to understand, even without any previous knowledge. Actually, the chosen pseudo-code is a simplified (non-executable) version of VHDL (Appendix A). Algorithm 1.1 is a loop whose body is executed every 10 s: the measured temperature temp is compared with the desired temperature pos defined by the mechanical selector position; then

• If temp is smaller than pos − 0.5, then the boiler must get started, so the output signal onoff = ON.
• If temp is greater than pos + 0.5, then the boiler must be stopped, so the output signal onoff = OFF.
• If temp lies between pos − 0.5 and pos + 0.5, then no action is undertaken and the signal onoff value remains unchanged.

This is a functional specification including some additional characteristics of the final system. For example: the temperature updating is performed every 10 s, so the arithmetic operations must be executed in less than 10 s, and the accuracy of the control is about 0.5°. As mentioned above, the type and range of the input and output signals must be defined.

• The input signal temp represents the ambient temperature measured by a sensor. Assume that the sensor is able to measure temperatures between 0 and 50°. Then temp is a signal whose type is "temperature" and whose range is "0 to 50°."
• The input signal pos is the position of a mechanical selector. Assume that it permits choosing any temperature between 10 and 30°. Then pos is a signal whose type is "position" and whose range is "10 to 30."
• The output signal onoff has only two possible values. Its type is "command" and its range is {ON, OFF}.
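The behavior of Algorithm 1.1 can be modeled in ordinary software. The following Python sketch is only an illustration (it is not part of the book, and the function and constant names are chosen freely); it shows the dead band created by the half_degree margin: inside the band the previous command is simply kept.

```python
HALF_DEGREE = 0.5  # plays the role of the predefined constant half_degree

def control_step(temp, pos, onoff):
    """One iteration of the control loop: return the new boiler command."""
    if temp < pos - HALF_DEGREE:
        return "ON"   # too cold: start the boiler
    elif temp > pos + HALF_DEGREE:
        return "OFF"  # too warm: stop the boiler
    return onoff      # inside the dead band: keep the previous command
```

Calling control_step every 10 s with the current sensor and selector values reproduces the loop of Algorithm 1.1.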


Fig. 1.2 Chronometer: three push buttons generate reset, start, and stop; an oscillator generates ref; the time computation block produces h, m, s, and t for the HOURS, MINUTES, SECONDS, and TENTHS displays.

Fig. 1.3 Time reference signal ref: a square wave with a period of 0.1 s.

Assume now that the sensor is an ideal one, able to measure the temperature with an infinite accuracy, and that the selector is a continuous one, able to define the desired temperature with an infinite precision. Then both signals temp and pos are real numbers whose ranges are [0, 50] and [10, 30], respectively. Those signals, characterized by a continuous and infinite range of values, are called analog signals. On the contrary, the range of the output signal onoff is a finite set {ON, OFF}. Signals whose range is a finite set (not necessarily binary, as in the case of onoff) are called digital signals or discrete signals.

Example 1.2 Figure 1.2 represents the structure of a chronometer.

• Three push buttons control its working. They generate binary (2-valued) signals reset, start, and stop.
• A crystal oscillator generates a time reference signal ref (Fig. 1.3): it is a square wave signal whose period is equal to 0.1 s (10 Hz).
• A time computation system computes the value of signals h (hours), m (minutes), s (seconds), and t (tenths of a second).
• A graphical interface displays the values of signals h, m, s, and t.

Consider the time computation block. It is a physical system (a subsystem of the complete chronometer) whose input signals are

• reset, start, and stop, generated by three push buttons; according to the state of the corresponding switch, their value belongs to the set {closed, open};
• ref, the signal generated by the crystal oscillator, assumed to be an ideal square wave equal to either 0 or 1 V;

and whose output signals are

• h, belonging to the set {0, 1, 2, . . ., 23};
• m and s, belonging to the set {0, 1, 2, . . ., 59};
• t, belonging to the set {0, 1, 2, . . ., 9}.


The relation between inputs and outputs can be defined as follows (in natural language):

• When reset is pushed down, then h = m = s = t = 0.
• When start is pushed down, the chronometer starts counting; h, m, s, and t represent the elapsed time in tenths of a second.
• When stop is pushed down, the chronometer stops counting; h, m, s, and t represent the latest elapsed time.

In this example, all input and output signal values belong to finite sets. So, according to a previous definition, all input and output signals are digital. Systems all of whose input and output signals are digital are called digital systems.
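The counting behavior of the time computation block can be sketched as a pure function: given the number of ref pulses (tenths of a second) counted since start, it yields the displayed values. This Python model is only an illustration (the block itself is a circuit, and the function name is invented here).

```python
def split_time(tenths):
    """Convert a count of 0.1 s pulses into the (h, m, s, t) display values."""
    t = tenths % 10                   # tenths of a second: 0..9
    total_seconds = tenths // 10
    s = total_seconds % 60            # seconds: 0..59
    m = (total_seconds // 60) % 60    # minutes: 0..59
    h = (total_seconds // 3600) % 24  # hours: 0..23
    return h, m, s, t
```

For example, 36,005 pulses (3600.5 s) yield h = 1, m = 0, s = 0, t = 5.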

1.2 Description Methods

In this section several specification methods are presented.

1.2.1 Functional Description

The relation between inputs and outputs of a digital system can be defined in a functional way, without any information about the internal structure of the system. Furthermore, a distinction can be made between explicit and implicit functional descriptions.

Example 1.3 Consider again the temperature controller of Example 1.1, with two modifications:

• The desired temperature (pos) is assumed to be constant and equal to 20° (pos = 20).
• The measured temperature has been discretized, so that the signal temp values belong to the set {0, 1, 2, . . ., 50}.

Then, the working of the controller can be described, in a completely explicit way, by Table 1.1, which associates to each value of temp the corresponding value of onoff: if temp is smaller than 20, then onoff = ON; if temp is greater than 20, then onoff = OFF; if temp is equal to 20, then onoff keeps its value unchanged. The same specification could be expressed by the following program.

Table 1.1 Explicit specification

temp   onoff
 0     ON
 1     ON
...
18     ON
19     ON
20     unchanged
21     OFF
22     OFF
...
49     OFF
50     OFF


Algorithm 1.2 Simplified Temperature Control

if temp < 20 then onoff = on;
elsif temp > 20 then onoff = off;
end if;
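The equivalence between the explicit description (Table 1.1) and the implicit one (Algorithm 1.2) can be checked mechanically. In this Python sketch (illustrative only; the names are invented), the table becomes a lookup structure and the algorithm a function, and both specify the same behavior.

```python
def onoff_implicit(temp, onoff):
    """Implicit specification: Algorithm 1.2 with the set point fixed at 20."""
    if temp < 20:
        return "ON"
    elif temp > 20:
        return "OFF"
    return onoff  # temp == 20: keep the previous value

# Explicit specification: one entry per possible temp value, as in Table 1.1.
TABLE = {t: ("ON" if t < 20 else "OFF" if t > 20 else "unchanged")
         for t in range(51)}
```

Enumerating all 51 temp values confirms that the table and the algorithm agree everywhere.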

This type of description, by means of an algorithm, will be called an "implicit functional description." In such a simple example, the difference between Table 1.1 and Algorithm 1.2 is only formal; in fact it is the same description. In more complex systems, a completely explicit description (a table) could be unmanageable.

Example 1.4 As a second example of functional specification, consider a system (Fig. 1.4) that adds two 2-digit numbers. Its input signals are

• x1, x0, y1, and y0, whose values belong to {0, 1, 2, . . ., 9},

and its output signals are

• z2, whose values belong to {0, 1}, and z1 and z0, whose values belong to {0, 1, 2, . . ., 9}.

Digits x1 and x0 represent a number X belonging to the set {0, 1, 2, . . ., 99}; digits y1 and y0 represent a number Y belonging to the same set {0, 1, 2, . . ., 99}; and digits z2, z1, and z0 represent a number Z belonging to the set {0, 1, 2, . . ., 198}, where 198 = 99 + 99 is the maximum value of X + Y. An explicit functional specification is Table 1.2, which contains 10,000 rows! Another way to specify the function of a 2-digit adder is the following algorithm, in which the symbol / stands for integer division.

Fig. 1.4 2-Digit adder: inputs x1, x0, y1, y0; outputs z2, z1, z0.

Table 1.2 Explicit specification of a 2-digit adder

x1 x0   y1 y0   z2 z1 z0
 00      00      000
 00      01      001
...
 00      99      099
 01      00      001
 01      01      002
...
 01      99      100
...
 99      00      099
 99      01      100
...
 99      99      198


Algorithm 1.3 2-Digit Adder

X = 10x1 + x0;
Y = 10y1 + y0;
Z = X + Y;
z2 = Z/100;
z1 = (Z - 100z2)/10;
z0 = Z - 100z2 - 10z1;
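Algorithm 1.3 translates almost line by line into executable code. In this Python sketch (an illustration; the function name is invented, and the pseudocode's integer division / becomes Python's //):

```python
def add_2digit(x1, x0, y1, y0):
    """Algorithm 1.3: add two 2-digit numbers given as digit pairs."""
    X = 10 * x1 + x0
    Y = 10 * y1 + y0
    Z = X + Y
    z2 = Z // 100                # hundreds digit: 0 or 1
    z1 = (Z - 100 * z2) // 10    # tens digit
    z0 = Z - 100 * z2 - 10 * z1  # units digit
    return z2, z1, z0
```

With x1 = 5, x0 = 7, y1 = 7, and y0 = 1 it returns (1, 2, 8), i.e., 57 + 71 = 128.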

As an example, if x1 = 5, x0 = 7, y1 = 7, and y0 = 1, then
X = 10·5 + 7 = 57.
Y = 10·7 + 1 = 71.
Z = 57 + 71 = 128.
z2 = 128/100 = 1.
z1 = (128 − 100·1)/10 = 28/10 = 2.
z0 = 128 − 100·1 − 10·2 = 8.
At the end of the algorithm execution: X + Y = Z = 100·z2 + 10·z1 + z0.

Table 1.2 and Algorithm 1.3 are functional specifications. The first is explicit, the second is implicit, and both are directly deduced from the initial informal definition: digits x1 and x0 represent X, digits y1 and y0 represent Y, and digits z2, z1, and z0 represent Z = X + Y. Another way to define the working of the 2-digit adder is to use the classical pencil and paper algorithm. Given two 2-digit numbers x1 x0 and y1 y0,

• Compute s0 = x0 + y0.
• If s0 < 10, then z0 = s0 and carry = 0; in the contrary case (s0 ≥ 10), z0 = s0 − 10 and carry = 1.
• Compute s1 = x1 + y1 + carry.
• If s1 < 10, then z1 = s1 and z2 = 0; in the contrary case (s1 ≥ 10), z1 = s1 − 10 and z2 = 1.

Algorithm 1.4 Pencil and Paper Algorithm

s0 = x0 + y0;
if s0 >= 10 then z0 = s0 - 10; carry = 1;
else z0 = s0; carry = 0;
end if;
s1 = x1 + y1 + carry;
if s1 >= 10 then z1 = s1 - 10; z2 = 1;
else z1 = s1; z2 = 0;
end if;

As an example, if x1 = 5, x0 = 7, y1 = 7, and y0 = 1, then s0 = 7 + 1 = 8; s0 < 10 so that z0 = 8; carry = 0;

Fig. 1.5 1-Digit adder: inputs x, y, and carryIN; outputs z and carryOUT.

s1 = 5 + 7 + 0 = 12; s1 ≥ 10 so that z1 = 12 − 10 = 2; z2 = 1; and thus 57 + 71 = 128.

Comment 1.1 Algorithm 1.4 is another implicit functional specification. However, it is not directly deduced from the initial informal definition, as was the case for Table 1.2 and Algorithm 1.3. It includes a particular step-by-step addition method and, to some extent, already gives some indication about the structure of the system (the subject of Sect. 1.2.2). Furthermore, it could easily be generalized to the case of n-digit operands for any n > 2.
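Algorithm 1.4 can be sketched the same way. This Python model (illustrative only; the function name is invented) makes the carry propagation between the two digit positions explicit, which Algorithm 1.3 hides behind full-width arithmetic.

```python
def pencil_and_paper_add(x1, x0, y1, y0):
    """Algorithm 1.4: digit-wise addition with an explicit carry."""
    s0 = x0 + y0
    if s0 >= 10:
        z0, carry = s0 - 10, 1   # units sum overflows: subtract 10, carry 1
    else:
        z0, carry = s0, 0
    s1 = x1 + y1 + carry
    if s1 >= 10:
        z1, z2 = s1 - 10, 1      # tens sum overflows: z2 is the hundreds digit
    else:
        z1, z2 = s1, 0
    return z2, z1, z0
```

It produces the same digit triple as Algorithm 1.3 for every pair of operands; for 57 + 71 it returns (1, 2, 8).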

1.2.2 Structural Description

Another way to specify the relation between inputs and outputs of a digital system is to define its internal structure. For that, a set of previously defined and reusable subsystems, called components, must be available.

Example 1.5 Assume that a component called 1-digit adder (Fig. 1.5) has been previously defined. Its input signals are

• digits x and y, belonging to {0, 1, 2, . . ., 9},
• carryIN ∈ {0, 1},

and its output signals are

• z ∈ {0, 1, 2, . . ., 9},
• carryOUT ∈ {0, 1}.

Every 1-digit adder component executes the operations that correspond to a particular step of the pencil and paper addition method (Algorithm 1.4):

• Add two digits and an incoming carry.
• If the obtained sum is greater than or equal to 10, subtract 10; the outgoing carry is 1. In the contrary case the outgoing carry is 0.

The following algorithm specifies its working.

Fig. 1.6 4-Digit adder: four 1-digit adders in cascade; inputs x3. . .x0 and y3. . .y0, outputs z4. . .z0, rightmost carry input tied to 0

Algorithm 1.5 1-Digit Adder

s = x + y + carryIN;
if s ≥ 10 then z = s - 10; carryOUT = 1;
else z = s; carryOUT = 0;
end if;

With this component, the structure of a 4-digit adder can be defined (Fig. 1.6). It computes the sum Z = X + Y where X = x3 x2 x1 x0 and Y = y3 y2 y1 y0 are two 4-digit numbers and Z = z4 z3 z2 z1 z0 is a 5-digit number whose most significant digit z4 is 0 or 1 (X + Y ≤ 9999 + 9999 = 19,998).

Comment 1.2 In the previous Example 1.5, four identical components (1-digit adders) are used to define a 4-digit adder by means of its structure (Fig. 1.6). The 1-digit adder in turn has been defined by its function (Algorithm 1.5). This is an example of a 2-level hierarchical description. The first level is a diagram that describes the structure of the system, while the second level is the functional description of the components.
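The structural view of Fig. 1.6 can be mirrored in code: one function plays the role of the 1-digit adder component (Algorithm 1.5), and the 4-digit adder simply instantiates it four times and chains the carries. Names are illustrative:

```python
def one_digit_adder(x, y, carry_in):
    """The component of Fig. 1.5 / Algorithm 1.5."""
    s = x + y + carry_in
    if s >= 10:
        return s - 10, 1   # z, carryOUT
    return s, 0

def four_digit_adder(x, y):
    """Four 1-digit adder instances chained as in Fig. 1.6.

    x and y are 4-digit tuples (x3, x2, x1, x0);
    returns (z4, z3, z2, z1, z0).
    """
    x3, x2, x1, x0 = x
    y3, y2, y1, y0 = y
    z0, c1 = one_digit_adder(x0, y0, 0)   # rightmost cell, carryIN = 0
    z1, c2 = one_digit_adder(x1, y1, c1)
    z2, c3 = one_digit_adder(x2, y2, c2)
    z3, z4 = one_digit_adder(x3, y3, c3)  # final carry out is z4
    return z4, z3, z2, z1, z0

print(four_digit_adder((9, 9, 9, 9), (9, 9, 9, 9)))  # -> (1, 9, 9, 9, 8)
```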

1.2.3 Hierarchical Description

Hierarchical descriptions with more than two levels can be considered. The following example describes a 3-level hierarchical description.

Example 1.6 Consider a system that computes the sum z = w + x + y where w, x, and y are 4-digit numbers. The maximum value of z is 9999 + 9999 + 9999 = 29,997, which is a 5-digit number whose most significant digit is equal to 0, 1, or 2. The first hierarchical level (top level) is a block diagram with two different blocks (Fig. 1.7): a 4-digit adder and a 5-digit adder. The 4-digit adder can be divided into four 1-digit adders (Fig. 1.8) and the 5-digit adder can be divided into five 1-digit adders (Fig. 1.9). Figures 1.8 and 1.9 constitute a second hierarchical level. Finally, a 1-digit adder (Fig. 1.5) can be defined by its functional description (Algorithm 1.5). It constitutes a third hierarchical level (bottom level). Thus, the description of the system that computes z consists of three levels (Fig. 1.10). The lowest level is the functional description of a 1-digit adder. Assuming that 1-digit adder components are available, the system can be built with nine components.

A hierarchical description could be defined as follows.

• It is a set of interconnected blocks.
• Every block, in turn, is described either by its function or by a set of interconnected blocks, and so on.
• The final blocks correspond to available components defined by their function.

Fig. 1.7 Top level: a 4-digit adder computes u = x + y (a 5-digit result); a 5-digit adder then computes z = w + u

Fig. 1.8 4-Digit adder: four cascaded 1-digit adders (as in Fig. 1.6)

Fig. 1.9 5-Digit adder: five cascaded 1-digit adders adding the 4-digit operand w to the 5-digit operand u

Fig. 1.10 Hierarchical description: the top level of Fig. 1.7, expanded into the 1-digit adders of Figs. 1.8 and 1.9, each 1-digit adder being defined by Algorithm 1.5 (s = x + y + ci; if s ≥ 10 then z = s − 10; co = 1; else z = s; co = 0; end if)

Comments 1.3 Generally, the initial specification of a digital system is functional (a description of what the system does). In the case of very simple systems it could be a table that defines the output signal values as a function of the input signal values. However, for more complex systems other specification methods should be used. A natural language description (e.g., in English) is a frequent option. Nevertheless, an algorithmic description (programming language, hardware description language, pseudo-code) could be a better choice: those languages have a more precise and unambiguous semantics than natural languages. Furthermore, programming language and hardware description language specifications can be compiled and executed, so that the initial specification can be tested. The use of algorithms to define the function of digital systems is one of the key aspects of this course.

In other cases, the initial specification already gives some information about the way the system must be implemented (see Examples 1.5 and 1.6). In fact, the digital system designer's work is the generation of a circuit made up of available components whose behavior corresponds to the initial specification. Many times this work consists of successive refinements of an initial description: starting from an initial specification a (top level) block diagram is generated; then, every block is treated as a subsystem to which a more detailed block diagram is associated, and so on. The design work ends when all block diagrams are made up of interconnected components defined by their function and belonging to some available library of physical components (Chap. 7).

1.3 Digital Electronic Systems

The definition of a digital system given in Sect. 1.1 is very general and refers to any type of physical system whose input and output values belong to a finite set. In what follows, this course will focus on electronic systems.

1.3.1 Real System Structure

Most real digital systems include (Fig. 1.11)

• Input devices such as sensors, keyboards, microphones, and communication receivers.
• Output devices such as displays, motors, communication transmitters, and loudspeakers.

Fig. 1.11 Structure of a real digital system: input devices (keyboard, switches, sensors, receiver, . . .) feed conversion blocks that produce discrete electrical signals for the digital electronic system, whose outputs are converted in turn to drive the output devices (motor, display, transmitter, . . .)


• Input converters that translate the information generated by the input devices into discrete electrical signals.
• Output converters that translate discrete electrical signals into signals able to control the output devices.
• A digital electronic circuit (the brain of the system) that generates output electrical data as a function of the input electrical data.

In Example 1.2, the input devices are three switches (push buttons) and a crystal oscillator, and the output device is a 7-digit display. The time computation block is an electronic circuit that constitutes the brain of the complete system. Thus, real systems consist of a set of input and output interfaces that connect the input and output devices to the kernel of the system. The kernel of the system is a digital electronic system whose input and output signals are discrete electrical signals. In most cases those input and output signals are binary encoded data. As an example, numbers can be encoded according to the binary numeration system, and characters such as letters, digits, or some symbols can be encoded according to the standard ASCII codes (American Standard Code for Information Interchange).

1.3.2 Electronic Components

To build digital electronic systems, electronic components are used. In this section some basic information about digital electronic components is given. Much more complete and detailed information about digital electronics can be found in books such as Weste and Harris (2010) or Rabaey et al. (2003).

1.3.2.1 Binary Codification

A first question: it has been mentioned above that, in most cases, the input and output signals are binary encoded data; but how are the binary digits (bits) 0 and 1 physically (electrically) represented? The usual solution consists in defining a low voltage VL and a high voltage VH, and conventionally associating VL with bit 0 and VH with bit 1. The values of VL and VH depend on the implementation technology. In this section it is assumed that VL = 0 V and VH = 1 V.

1.3.2.2 MOS Transistors

Nowadays, most digital circuits are made up of interconnected MOS transistors. They are very small devices, and large integrated circuits contain millions of transistors.

Fig. 1.12 MOS transistors: (a) n-type symbol; (b) p-type symbol, each with terminals S, D, and G

MOS transistors (Fig. 1.12a, b) have three terminals called S (source), D (drain), and G (gate). There are two types of transistors: n-type (Fig. 1.12a) and p-type (Fig. 1.12b), where n and p refer to the type of majority electrical charges (carriers) that can flow from terminal S (source) to terminal D (drain) under the control of the gate voltage: in an nMOS transistor the majority carriers are


electrons (negative charges), so that the current flows from D to S; in a pMOS transistor the majority carriers are holes (positive charges), so that the current flows from S to D.

A very simplified model (Fig. 1.13) is now used to describe the working of an nMOS transistor: it works like a switch controlled by the transistor gate voltage. If the gate voltage VG is low (0 V), then the switch is open (Fig. 1.14a, b) and no current can flow. If the gate voltage VG is high (1 V), then the switch is closed (Fig. 1.14c, d) and VOUT tends to be equal to VIN. However, if VIN is high (1 V), then VOUT is not equal to 1 V (Fig. 1.14d). The maximum value of VOUT is VG − VT, where the threshold voltage VT is a characteristic of the implementation technology. It could be said that an nMOS transistor is a good switch for transmitting VL (Fig. 1.14c), but not a good switch for transmitting VH (Fig. 1.14d).

A similar model can be used to describe the working of a pMOS transistor. If the gate voltage VG is high (1 V), then the switch is open (Fig. 1.15a, b) and no current can flow. If the gate voltage VG is low (0 V), then the switch is closed (Fig. 1.15c, d) and VOUT tends to be equal to VIN. However, if VIN is low (0 V), then VOUT is not equal to 0 V (Fig. 1.15d). Actually the minimum value of VOUT is VG + |VT|, where the threshold voltage VT is a characteristic of the implementation technology. It could be said that a pMOS transistor is a good switch for transmitting VH (Fig. 1.15c), but not a good switch for transmitting VL (Fig. 1.15d).
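The simplified switch model lends itself to a tiny numerical sketch. The threshold voltage below (VT = 0.25 V) is an assumed illustrative value, since the text only states that VT depends on the technology; None stands for a floating (open-switch) output:

```python
VT = 0.25  # threshold voltage: an assumed illustrative value

def nmos(v_gate, v_in):
    """nMOS as a switch: closed when the gate is high; degrades high levels."""
    if v_gate < 0.5:               # low gate voltage -> open switch
        return None
    return min(v_in, v_gate - VT)  # maximum output is VG - VT

def pmos(v_gate, v_in):
    """pMOS as a switch: closed when the gate is low; degrades low levels."""
    if v_gate >= 0.5:              # high gate voltage -> open switch
        return None
    return max(v_in, v_gate + VT)  # minimum output is VG + |VT|

print(nmos(1.0, 0.0))  # -> 0.0  : good at transmitting VL
print(nmos(1.0, 1.0))  # -> 0.75 : degraded high level (VG - VT)
print(pmos(0.0, 1.0))  # -> 1.0  : good at transmitting VH
print(pmos(0.0, 0.0))  # -> 0.25 : degraded low level (VG + |VT|)
```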

Fig. 1.13 Equivalent model: the transistor behaves as a switch between VIN and VOUT, controlled by the gate voltage VG

Fig. 1.14 nMOS switches: (a, b) open circuit when VG = 0 V; (c) closed, transmitting 0 V correctly; (d) closed, transmitting a degraded high level (< 1 V)

Fig. 1.15 pMOS switches: (a, b) open circuit when VG = 1 V; (c) closed, transmitting 1 V correctly; (d) closed, transmitting a degraded low level (> 0 V)

1.3.2.3 CMOS Inverter

By interconnecting several transistors, small components called logic gates can be implemented. The simplest one (Fig. 1.16) is the CMOS inverter, also called NOT gate. A CMOS inverter consists of two transistors:

• A pMOS transistor whose source is connected to the high voltage VH (1 V), whose gate is connected to the circuit input, and whose drain is connected to the circuit output.
• An nMOS transistor whose source is connected to the low voltage VL (0 V), whose gate is connected to the circuit input, and whose drain is connected to the circuit output.

To analyze the working of this circuit in the case of binary signals, consider the two following input values:

• If VIN = 0 V, then (Fig. 1.17a), according to the simplified model of Sect. 1.3.2.2, the nMOS transistor is equivalent to an open switch and the pMOS transistor is equivalent to a closed switch (a good switch for transmitting VH), so that VOUT = 1 V.
• If VIN = 1 V, then (Fig. 1.17b) the pMOS transistor is equivalent to an open switch and the nMOS transistor is equivalent to a closed switch (a good switch for transmitting VL), so that VOUT = 0 V.
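Under the same idealized switch model, the inverter analysis can be replayed in code (a sketch for binary inputs only, not a circuit simulator):

```python
VH, VL = 1.0, 0.0  # high and low supply voltages (1 V and 0 V in the text)

def cmos_inverter(v_in):
    """CMOS inverter of Fig. 1.16 under the idealized switch model."""
    pmos_closed = (v_in == VL)  # pMOS conducts when its gate is low
    nmos_closed = (v_in == VH)  # nMOS conducts when its gate is high
    if pmos_closed:
        return VH  # pMOS is a good switch for transmitting VH
    if nmos_closed:
        return VL  # nMOS is a good switch for transmitting VL
    raise ValueError("non-binary input in this simplified model")

print(cmos_inverter(0.0))  # -> 1.0
print(cmos_inverter(1.0))  # -> 0.0
```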

Fig. 1.16 CMOS inverter: pMOS between 1 V and VOUT, nMOS between VOUT and 0 V, both gates driven by VIN

Fig. 1.17 Working of a CMOS inverter: (a) VIN = 0 V gives VOUT = 1 V; (b) VIN = 1 V gives VOUT = 0 V

Fig. 1.18 Inverter: (a) behavior (IN = 0 → OUT = 1, IN = 1 → OUT = 0); (b) logic symbol

Fig. 1.19 2-Input NAND gate (NAND2 gate): two pMOS transistors in parallel between 1 V and VOUT, two nMOS transistors in series between VOUT and 0 V, gates driven by VIN1 and VIN2

The conclusion of this analysis is that, as long as only binary signals are considered, the circuit of Fig. 1.16 inverts the input signal: it transforms VL (0 V) into VH (1 V) and VH (1 V) into VL (0 V). In terms of bits, it transforms 0 into 1 and 1 into 0 (Fig. 1.18a). As long as only the logic behavior is considered (the relation between input bits and output bits), the standard inverter symbol of Fig. 1.18b is used.

1.3.2.4 Other Components

With four transistors (Fig. 1.19) a 2-input circuit called NAND gate can be implemented. It works as follows:

• If VIN1 = VIN2 = 1 V, then both pMOS switches are open and both nMOS switches are closed, so that they transmit VL = 0 V to the gate output (Fig. 1.20a).
• If VIN2 = 0 V, whatever the value of VIN1, then at least one of the nMOS switches (connected in series) is open and at least one of the pMOS switches (connected in parallel) is closed, so that VH = 1 V is transmitted to the gate output (Fig. 1.20b).
• If VIN1 = 0 V, whatever the value of VIN2, the conclusion is the same.

Thus, the logic behavior of a 2-input NAND gate is given in Fig. 1.21a and the corresponding symbol is shown in Fig. 1.21b. The output of a 2-input NAND gate (NAND2) is equal to 0 if, and only if, both inputs are equal to 1. In all other cases the output is equal to 1.


Fig. 1.20 NAND gate working: (a) both inputs at 1 V; (b) one input at 0 V

Fig. 1.21 2-Input NAND gate: (a) behavior; (b) symbol

IN1 IN2 | OUT
 0   0  |  1
 0   1  |  1
 1   0  |  1
 1   1  |  0

Fig. 1.22 NOR2 gate: (a) circuit (two pMOS in series, two nMOS in parallel); (b) behavior; (c) symbol

IN1 IN2 | OUT
 0   0  |  1
 0   1  |  0
 1   0  |  0
 1   1  |  0

Other logic gates can be defined and used as basic components of digital circuits. Some of them will now be mentioned. Much more complete information about logic gates can be found in classical books such as Floyd (2014) or Mano and Ciletti (2012). The circuit of Fig. 1.22a is a 2-input NOR gate (NOR2 gate). If VIN1 = VIN2 = 0 V, then both p-type switches are closed and both n-type switches are open, so that VH = 1 V is transmitted to the gate output. In all other cases at least one of the p-type switches is open and at least one of the n-type


switches is closed, so that VL = 0 V is transmitted to the gate output. The logic behavior and the symbol of a NOR2 gate are shown in Fig. 1.22b, c.

NAND and NOR gates with more than two inputs can be defined. The output of a k-input NAND gate is equal to 0 if, and only if, the k inputs are equal to 1. The corresponding circuit (similar to Fig. 1.19) has k p-type transistors in parallel and k n-type transistors in series. The output of a k-input NOR gate is equal to 1 if, and only if, the k inputs are equal to 0. The corresponding circuit (similar to Fig. 1.22) has k n-type transistors in parallel and k p-type transistors in series. The symbol of a 3-input NAND gate (NAND3 gate) is shown in Fig. 1.23a and the symbol of a 3-input NOR gate (NOR3 gate) is shown in Fig. 1.23b.

The logic circuit of Fig. 1.24a consists of a NAND2 gate and an inverter. The output is equal to 1 if, and only if, both inputs are equal to 1 (Fig. 1.24b). It is a 2-input AND gate (AND2 gate) whose symbol is shown in Fig. 1.24c. The logic circuit of Fig. 1.25a consists of a NOR2 gate and an inverter. The output is equal to 0 if, and only if, both inputs are equal to 0 (Fig. 1.25b). It is a 2-input OR gate (OR2 gate) whose symbol is shown in Fig. 1.25c.

AND and OR gates with more than two inputs can be defined. The output of a k-input AND gate is equal to 1 if, and only if, the k inputs are equal to 1, and the output of a k-input OR gate is equal to 0 if, and only if, the k inputs are equal to 0. For example, an AND3 gate can be implemented with a NAND3 gate and an inverter (Fig. 1.26a). Its symbol is shown in Fig. 1.26b. An OR3 gate can be implemented with a NOR3 gate and an inverter (Fig. 1.26c). Its symbol is shown in Fig. 1.26d.
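The constructions just described (a k-input AND as a k-input NAND followed by an inverter, and dually for OR) can be verified exhaustively. The helper functions are mine:

```python
from itertools import product

def nand(*bits): return 0 if all(bits) else 1
def nor(*bits):  return 1 if not any(bits) else 0
def inv(b):      return 1 - b

def and_gate(*bits): return inv(nand(*bits))  # NANDk + inverter (Fig. 1.26a)
def or_gate(*bits):  return inv(nor(*bits))   # NORk + inverter (Fig. 1.26c)

# Exhaustive check for 3 inputs against the textbook definitions
for a, b, c in product((0, 1), repeat=3):
    assert and_gate(a, b, c) == (1 if (a, b, c) == (1, 1, 1) else 0)
    assert or_gate(a, b, c) == (0 if (a, b, c) == (0, 0, 0) else 1)
print("AND3/OR3 constructions verified")
```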

Fig. 1.23 NAND3 and NOR3: (a) NAND3 symbol; (b) NOR3 symbol

Fig. 1.24 AND2 gate: (a) NAND2 gate followed by an inverter; (b) behavior; (c) symbol

IN1 IN2 | OUT
 0   0  |  0
 0   1  |  0
 1   0  |  0
 1   1  |  1

Fig. 1.25 OR2 gate: (a) NOR2 gate followed by an inverter; (b) behavior; (c) symbol

IN1 IN2 | OUT
 0   0  |  0
 0   1  |  1
 1   0  |  1
 1   1  |  1

Fig. 1.26 AND3 and OR3 gates: (a) NAND3 gate followed by an inverter; (b) AND3 symbol; (c) NOR3 gate followed by an inverter; (d) OR3 symbol

Fig. 1.27 Buffer: (a) two inverters in series; (b) symbol

Fig. 1.28 3-State buffer: (a) circuit with data input IN, control input C, and output OUT; (b) symbol

Buffers are another type of basic digital component. The circuit of Fig. 1.27a, made up of two inverters, generates an output signal equal to the input signal. Thus, it has no logic function; it is a power amplifier. Its symbol is shown in Fig. 1.27b.

The circuit of Fig. 1.28a is a 3-state buffer. It consists of a buffer, an inverter, a pMOS transistor, and an nMOS transistor. It has two inputs IN and C (control) and an output OUT. If C = 0, then both switches (n-type and p-type) are open, so that the output OUT is disconnected from the input IN (floating state or high impedance state). If C = 1, then both switches are closed, so that the output OUT is connected to the input IN through a good (p-type) switch if IN = 1 and through a good (n-type) switch if IN = 0. The 3-state buffer symbol is shown in Fig. 1.28b.

Other small-size components such as multiplexers, encoders, decoders, latches, flip-flops, and others will be defined in the next chapters. To conclude this section about digital components, an example of a larger size component is given. Figure 1.29a is the symbol of a read-only memory (ROM) that stores four 3-bit words. Its behavior is specified in Fig. 1.29b: with two address bits x1 and x0, one of the four stored words is selected and can be read from outputs z2, z1, and z0. More generally, a ROM with N address bits and M output bits stores 2^N M-bit words (in total M·2^N bits).
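A ROM is just a table of 2^N M-bit words indexed by the address. The sketch below models the 4-word, 3-bit ROM of Fig. 1.29 (contents taken from Fig. 1.29b; the constructor name is illustrative):

```python
def make_rom(words, n_addr_bits, m_out_bits):
    """Model a ROM storing 2**n_addr_bits words of m_out_bits bits each."""
    assert len(words) == 2 ** n_addr_bits
    assert all(len(w) == m_out_bits for w in words)
    def read(*address_bits):
        addr = 0
        for bit in address_bits:   # most significant address bit first
            addr = addr * 2 + bit
        return words[addr]
    return read

# The 4-word, 3-bit ROM of Fig. 1.29: word (z2, z1, z0) per address x1 x0
rom = make_rom([(0, 1, 0), (1, 1, 1), (1, 0, 0), (0, 1, 0)], 2, 3)
print(rom(0, 1))  # word read at address x1 x0 = 01
```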

Fig. 1.29 12-bit read-only memory (3 × 2^2-bit ROM): (a) symbol with address inputs x1, x0 and data outputs z2, z1, z0; (b) stored contents

x1 x0 | z2 z1 z0
 0  0 |  0  1  0
 0  1 |  1  1  1
 1  0 |  1  0  0
 1  1 |  0  1  0

1.3.3 Synthesis of Digital Electronic Systems

The central topic of this course is the synthesis of digital electronic systems. The problem can be stated in the following way.

• On the one hand, the system designer has the specification of a system to be developed. Several specification methods have been proposed in Sect. 1.2.
• On the other hand, the system designer has a catalog of available electronic components, such as logic gates, memories, and others, and might have access to previously developed and reusable subsystems. Some of the more common electronic components have been described in Sect. 1.3.2.

The designer's work is the definition of a digital system that fulfils the initial specification and uses building blocks that belong to the catalog of available components or are previously designed subsystems. In a more formal way, it could be said that the designer's work is the generation of a hierarchical description whose final blocks are electronic components or reusable electronic subsystems.

1.4 Exercises

1. The working of the chronometer of Example 1.2 can be specified by the following program, in which the condition ref_positive_edge is assumed to be true on every positive edge of signal ref.

loop
  if reset = ON then h = 0; m = 0; s = 0; t = 0;
  elsif start = ON then
    while stop = OFF loop
      if ref_positive_edge = TRUE then update(h, m, s, t); end if;
    end loop;
  end if;
end loop;

The update procedure updates the values of h, m, s, and t every time there is a positive edge on ref, that is to say, every tenth of a second. Generate a pseudo-code program that defines the update procedure.


2. Given two numbers X and Y = y3·10^3 + y2·10^2 + y1·10 + y0, the product P = X·Y can be expressed as P = y0·X + y1·X·10 + y2·X·10^2 + y3·X·10^3. Generate a pseudo-code program based on the preceding relation to compute P.

3. Given two numbers X and Y = y3·10^3 + y2·10^2 + y1·10 + y0, the product P = X·Y can be expressed as P = (((y3·X)·10 + y2·X)·10 + y1·X)·10 + y0·X. Generate a pseudo-code program based on the preceding relation to compute P.

4. Analyze the working of the following circuit and generate a 16-row table that defines VOUT as a function of VIN1, VIN2, VIN3, and VIN4.

[Circuit diagram: a CMOS transistor network between 1 V and 0 V, with gates driven by VIN1, VIN2, VIN3, and VIN4]

5. Analyze the working of the following circuit and generate an 8-row table that defines VOUT as a function of VIN1, VIN2, and VIN3.

[Circuit diagram: a CMOS transistor network between 1 V and 0 V, with gates driven by VIN1, VIN2, and VIN3]


References

Floyd TL (2014) Digital fundamentals. Prentice Hall, Upper Saddle River
Mano MMR, Ciletti MD (2012) Digital design. Prentice Hall, Boston
Rabaey JM, Chandrakasan A, Nikolic B (2003) Digital integrated circuits: a design perspective. Prentice Hall, Upper Saddle River
Weste NHE, Harris DM (2010) CMOS VLSI design: a circuits and systems perspective. Pearson, Boston

2 Combinational Circuits

Given a digital electronic circuit specification and a set of available components, how can the designer translate this initial specification into a circuit? The answer is the central topic of this course. In this chapter, an answer is given in the particular case of combinational circuits.

2.1 Definitions

A switching function is a binary function of binary variables. In other words, an n-variable switching function associates a binary value, 0 or 1, with any n-component binary vector. As an example, in Fig. 1.29, z2, z1, and z0 are three 2-variable switching functions. A digital circuit that implements a set of switching functions in such a way that at any time the output signal values only depend on the input signal values at the same moment is called a combinational circuit. The important point of this definition is "at the same moment." A combinational circuit with n inputs and m outputs is shown in Fig. 2.1. It implements m switching functions

f_i: {0, 1}^n → {0, 1}, i = 0, 1, . . ., m − 1.

To understand the condition "at the same moment," an example of a circuit that is not combinational is now given.

Example 2.1 (A Non-combinational Circuit) Consider the temperature controller of Example 1.3 defined by Table 1.1 and substitute ON by 1 and OFF by 0. The obtained Table 2.1 does not define a combinational circuit: the knowledge that the current temperature is 20 does not permit deciding whether the output signal must be 0 or 1. To decide, it is necessary to know the previous value of the temperature. In other words, this circuit must have some kind of memory.

Example 2.2 Consider the 4-bit adder of Fig. 2.2. Input bits x3, x2, x1, and x0 represent an integer X in binary numeration (Appendix C); input bits y3, y2, y1, and y0 represent another integer Y; input bit ci is an incoming carry; and output bits z4, z3, z2, z1, and z0 represent an integer Z. The relation between inputs and outputs is

# Springer International Publishing Switzerland 2017 J.-P. Deschamps et al., Digital Systems, DOI 10.1007/978-3-319-41198-9_2


Fig. 2.1 n-Input m-output combinational circuit: inputs x0, x1, . . ., xn−1; outputs y_i = f_i(x0, x1, . . ., xn−1), i = 0, 1, . . ., m − 1

Table 2.1 Specification of a non-combinational circuit

Temp        | Onoff
0 . . . 19  | 1
20          | Unchanged
21 . . . 50 | 0

Fig. 2.2 4-Bit adder: inputs x3 x2 x1 x0, y3 y2 y1 y0, and ci; outputs co = z4 and z3 z2 z1 z0

Z = X + Y + ci.

Observe that X and Y are 4-bit integers included within the range 0–15, so that the maximum value of Z is 15 + 15 + 1 = 31, which is a 5-bit number. Output z4 could also be used as an outgoing carry co. In this example, the value of the output bits only depends on the value of the input bits at the same time; it is a combinational circuit.

2.2 Synthesis from a Table

A completely explicit specification of a 4-bit adder (Fig. 2.2) is a table that defines five switching functions z4, z3, z2, z1, and z0 of nine variables x3, x2, x1, x0, y3, y2, y1, y0, and ci (Table 2.2). A straightforward implementation method consists in storing the contents of Table 2.2 in a read-only memory (Fig. 2.3). The address bits are the input signals x3, x2, x1, x0, y3, y2, y1, y0, and ci, and the stored words define the value of the output signals z4, z3, z2, z1, and z0. As an example, if the address bits are 100111001, so that x3x2x1x0 = 1001, y3y2y1y0 = 1100, and ci = 1, then X = 9, Y = 12, and Z = 9 + 12 + 1 = 22, and the stored word is 10110, the binary representation of 22. Obviously this is a universal synthesis method: it can be used to implement any combinational circuit. The generic circuit of Fig. 2.1 can be implemented by the ROM of Fig. 2.4. However, in many

Table 2.2 Explicit specification of a 4-bit adder

x3x2x1x0 | y3y2y1y0 | ci | z4z3z2z1z0
0000     | 0000     | 0  | 00000
0000     | 0000     | 1  | 00001
0000     | 0001     | 0  | 00001
0000     | 0001     | 1  | 00010
0000     | 0010     | 0  | 00010
0000     | 0010     | 1  | 00011
. . .    | . . .    | .  | . . .
1001     | 1100     | 1  | 10110
. . .    | . . .    | .  | . . .
1111     | 1111     | 0  | 11110
1111     | 1111     | 1  | 11111

Fig. 2.3 ROM implementation of a 4-bit adder: address inputs x3 x2 x1 x0 y3 y2 y1 y0 ci; data outputs z4 z3 z2 z1 z0

Fig. 2.4 ROM implementation of a combinational circuit: address inputs x0, x1, . . ., xn−1; data outputs y0, y1, . . ., ym−1 = f0, f1, . . ., fm−1(x0, x1, . . ., xn−1)

cases this is a very inefficient implementation method: the ROM of Fig. 2.4 must store m·2^n bits, generally a (too) big number. As an example, the ROM of Fig. 2.3 stores 5·2^9 = 2,560 bits. Instead of using a universal, but inefficient, synthesis method, a better option is to take advantage of the peculiarities of the system under development. In the case of the preceding Example 2.2 (Fig. 2.2), a first step is to divide the 4-bit adder into four 1-bit adders (Fig. 2.5). Each 1-bit adder is a combinational circuit that implements two switching functions z and d of three variables x, y, and c. Each 1-bit adder executes the operations that correspond to a particular step of the binary addition method (Appendix C). A completely explicit specification is given in Table 2.3.
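Incidentally, the ROM of Fig. 2.3 can be emulated in a few lines, confirming both the worked address example (100111001 → 10110) and the 2,560-bit total (512 words of 5 bits):

```python
def adder_rom_word(address):
    """Given a 9-bit address string x3x2x1x0 y3y2y1y0 ci, return z4..z0."""
    x = int(address[0:4], 2)   # x3 x2 x1 x0
    y = int(address[4:8], 2)   # y3 y2 y1 y0
    ci = int(address[8], 2)    # incoming carry
    return format(x + y + ci, "05b")  # 5-bit word z4 z3 z2 z1 z0

# Build the full ROM: one 5-bit word per 9-bit address
rom = {format(a, "09b"): adder_rom_word(format(a, "09b")) for a in range(2**9)}
print(len(rom))          # -> 512 words, i.e., 512 * 5 = 2,560 stored bits
print(rom["100111001"])  # -> 10110  (X = 9, Y = 12, ci = 1, Z = 22)
```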


Fig. 2.5 Structure of a 4-bit adder: four 1-bit adders in cascade; each cell adds x_i, y_i, and the incoming carry c and produces z_i and the outgoing carry d

Table 2.3 Explicit specification of a 1-bit adder

x y c | d z
0 0 0 | 0 0
0 0 1 | 0 1
0 1 0 | 0 1
0 1 1 | 1 0
1 0 0 | 0 1
1 0 1 | 1 0
1 1 0 | 1 0
1 1 1 | 1 1

Fig. 2.6 ROM implementation of a 1-bit adder: address inputs x, y, c; data outputs d, z

In this case, a ROM implementation could be considered (Fig. 2.6). This type of small ROM (eight 2-bit words in this example) is often called a lookup table (LUT), and it is the method used in field programmable gate arrays (FPGAs) to implement switching functions of a few variables (Chap. 7). Instead of a ROM, a table can also be implemented by means of logic gates (Sect. 1.3.2), for example AND gates, OR gates, and inverters (or NOT gates). Remember that

• The output of an n-input AND gate is equal to 1 if, and only if, its n inputs are equal to 1.
• The output of an n-input OR gate is equal to 1 if, and only if, at least one of its n inputs is equal to 1.
• The output of an inverter is equal to 1 if, and only if, its input is equal to 0.

Define now a 3-input switching function p(x, y, c) as follows: p = 1 if, and only if, x = 1, y = 0, and c = 1 (Table 2.4).


Table 2.4 Explicit specification of p

x y c | p
0 0 0 | 0
0 0 1 | 0
0 1 0 | 0
0 1 1 | 0
1 0 0 | 0
1 0 1 | 1
1 1 0 | 0
1 1 1 | 0

Fig. 2.7 Implementation of p: an AND3 gate whose inputs are x, c, and the output of an inverter driven by y

Table 2.5 Explicit specification of p1, p2, p3, and p4

x y c | p1 p2 p3 p4
0 0 0 |  0  0  0  0
0 0 1 |  0  0  0  0
0 1 0 |  0  0  0  0
0 1 1 |  1  0  0  0
1 0 0 |  0  0  0  0
1 0 1 |  0  1  0  0
1 1 0 |  0  0  1  0
1 1 1 |  0  0  0  1

This function p is implemented by the circuit of Fig. 2.7: the output of the AND3 gate is equal to 1 if, and only if, x = 1, c = 1, and the inverter output is equal to 1, that is, if y = 0. The function d of Table 2.3 can be defined as follows: d is equal to 1 if, and only if, one of the following conditions is true:

x = 0, y = 1, c = 1;
x = 1, y = 0, c = 1;
x = 1, y = 1, c = 0;
x = 1, y = 1, c = 1.

The switching functions p1, p2, p3, and p4 can be associated with those conditions (Table 2.5). Actually, the function p of Table 2.4 is the function p2 of Table 2.5. Each function pi can be implemented in the same way as p (Fig. 2.7), as shown in Fig. 2.8. Finally, the function d can be defined as follows: d is equal to 1 if, and only if, one of the functions pi is equal to 1. The corresponding circuit is a simple OR4 gate (Fig. 2.9) and the complete circuit is shown in Fig. 2.10.
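The sum-of-products construction of d can be written exactly as described: four AND terms (the rows of Table 2.5) combined by an OR gate. Since d is the carry-out of a 1-bit adder, it must equal 1 exactly when x + y + c ≥ 2, which the sketch checks:

```python
from itertools import product

def inv(b): return 1 - b

def d_sop(x, y, c):
    """d as the OR of the four product terms p1..p4 (Figs. 2.8-2.10)."""
    p1 = inv(x) & y & c   # x = 0, y = 1, c = 1
    p2 = x & inv(y) & c   # x = 1, y = 0, c = 1
    p3 = x & y & inv(c)   # x = 1, y = 1, c = 0
    p4 = x & y & c        # x = 1, y = 1, c = 1
    return p1 | p2 | p3 | p4

# d is 1 iff at least two of x, y, c are 1 (the d column of Table 2.3)
for x, y, c in product((0, 1), repeat=3):
    assert d_sop(x, y, c) == (1 if x + y + c >= 2 else 0)
print("d matches Table 2.3")
```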

Fig. 2.8 Implementation of p1, p2, p3, and p4: four AND3 gates, each with the appropriate inputs inverted

Fig. 2.9 Implementation of d: an OR4 gate with inputs p1, p2, p3, and p4

Fig. 2.10 Complete circuit: four AND3 gates (one per product term) feeding an OR4 gate

Fig. 2.11 Simplified circuit: two AND3 gates and one AND2 gate feeding an OR3 gate

Comment 2.1 The conditions implemented by functions p3 and p4 are x = 1, y = 1, c = 0 and x = 1, y = 1, c = 1, and can obviously be substituted by the simple condition x = 1 and y = 1, whatever the value of c. Thus, in Fig. 2.10 two of the four AND3 gates can be replaced by a single AND2 gate (Fig. 2.11). The synthesis with logic gates of z (Table 2.3) is left as an exercise. In conclusion, given a combinational circuit whose initial specification is a table, two possible options are:


• To store the table contents in a ROM
• To translate the table to a circuit made up of logic gates

Furthermore, in the second case, some optimization must be considered: if the inverters are not taken into account, the circuit of Fig. 2.11 contains 4 logic gates and 11 logic gate inputs, while the circuit of Fig. 2.10 contains 5 logic gates and 16 logic gate inputs. In CMOS technology (Sect. 1.3.2) the number of transistors is equal to twice the number of gate inputs, so the latter could be used as a measure of the circuit complexity. A conclusion is that a tool that helps to minimize the number of gates and the number of gate inputs is necessary. It is the topic of the next section.
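Comment 2.1's simplification is easy to verify exhaustively: replacing the p3 and p4 terms by the single product x·y leaves the function unchanged:

```python
from itertools import product

def inv(b): return 1 - b

def d_full(x, y, c):
    """Four AND3 terms feeding an OR4 gate (Fig. 2.10)."""
    return (inv(x) & y & c) | (x & inv(y) & c) | (x & y & inv(c)) | (x & y & c)

def d_simplified(x, y, c):
    """Two AND3 terms plus one AND2 term feeding an OR3 gate (Fig. 2.11)."""
    return (inv(x) & y & c) | (x & inv(y) & c) | (x & y)

assert all(d_full(x, y, c) == d_simplified(x, y, c)
           for x, y, c in product((0, 1), repeat=3))
print("Fig. 2.10 and Fig. 2.11 are equivalent")
```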

2.3 Boolean Algebra

Boolean algebra is a mathematical support used to specify and to implement switching functions. Only finite Boolean algebras are considered in this course.

2.3.1 Definition

A Boolean algebra B is a finite set over which two binary operations are defined:

• The Boolean sum +
• The Boolean product ·

Those operations must satisfy six rules (postulates). The Boolean sum and the Boolean product are internal operations:

∀a and b ∈ B: a + b ∈ B and a·b ∈ B.  (2.1)

Actually, this postulate only emphasizes the fact that + and · are operations over B. The set B includes two particular (and different) elements 0 and 1 that satisfy the following conditions:

∀a ∈ B: a + 0 = a and a·1 = a.  (2.2)

In other words, 0 and 1 are neutral elements with respect to the sum (0) and with respect to the product (1). Every element of B has an inverse in B:

∀a ∈ B, ∃ ā ∈ B such that a + ā = 1 and a·ā = 0.  (2.3)

Both operations are commutative:

∀a and b ∈ B: a + b = b + a and a·b = b·a.  (2.4)

Both operations are associative:

∀a, b, and c ∈ B: a·(b·c) = (a·b)·c and a + (b + c) = (a + b) + c.  (2.5)

The product is distributive over the sum and the sum is distributive over the product:

∀a, b, and c ∈ B: a·(b + c) = a·b + a·c and a + b·c = (a + b)·(a + c).  (2.6)

Comment 2.2 Rules (2.1)–(2.6) constitute a set of symmetric postulates: given a rule, by interchanging sum and product, and 0 and 1, another rule is obtained: for example, the fact that a + 0 = a implies that a·1 = a, and the fact that a·(b + c) = a·b + a·c implies that a + b·c = (a + b)·(a + c). This property is called the duality principle.

The simplest example of a Boolean algebra is the set B2 = {0, 1} with the following operations (Table 2.6):

• a + b = 1 if, and only if, a = 1 or b = 1: the OR function of a and b
• a·b = 1 if, and only if, a = 1 and b = 1: the AND function of a and b
• The inverse of a is ā = 1 − a

It can easily be checked that all postulates are satisfied. As an example (Table 2.7), check that the product is distributive over the sum, that is, a·(b + c) = a·b + a·c. The relation between B2 and the logic gates defined in Sect. 1.3.2 is obvious: an AND gate implements a Boolean product, an OR gate implements a Boolean sum, and an inverter (NOT gate) implements the inverse function (Fig. 2.12).

Table 2.6 Operations over B2

a b | a + b | a·b | a'
0 0 |   0   |  0  | 1
0 1 |   1   |  0  | 1
1 0 |   1   |  0  | 0
1 1 |   1   |  1  | 0

Table 2.7 a·(b + c) = a·b + a·c

a b c | b + c | a·(b + c) | a·b | a·c | a·b + a·c
0 0 0 |   0   |     0     |  0  |  0  |     0
0 0 1 |   1   |     0     |  0  |  0  |     0
0 1 0 |   1   |     0     |  0  |  0  |     0
0 1 1 |   1   |     0     |  0  |  0  |     0
1 0 0 |   0   |     0     |  0  |  0  |     0
1 0 1 |   1   |     1     |  0  |  1  |     1
1 1 0 |   1   |     1     |  1  |  0  |     1
1 1 1 |   1   |     1     |  1  |  1  |     1

Fig. 2.12 Logic gates and Boolean functions

2.3 Boolean Algebra

Fig. 2.13 Two equivalent circuits: a. a·b + a·c; b. a·(b + c)

In fact, there is a direct relation between Boolean expressions and circuits. As an example, a consequence of the distributive property a·b + a·c = a·(b + c) is that the circuits of Fig. 2.13 implement the same switching function, say f. However, the circuit that corresponds to the Boolean expression a·b + a·c (Fig. 2.13a) has three gates and six gate inputs while the other (Fig. 2.13b) includes only two gates and three gate inputs. This is a first (and simple) example of how Boolean algebra helps the designer optimize circuits.

Other finite Boolean algebras can be defined. Consider the set B2^n = {0, 1}^n, that is, the set of all n-component binary vectors. It is a Boolean algebra in which product, sum, and inversion are component-wise operations: ∀a = (a0, a1, ..., an−1) and b = (b0, b1, ..., bn−1) ∈ B2^n:

a + b = (a0 + b0, a1 + b1, ..., an−1 + bn−1),
a·b = (a0·b0, a1·b1, ..., an−1·bn−1),
a' = (a0', a1', ..., an−1').

The neutral elements are 0 = (0, 0, ..., 0) and 1 = (1, 1, ..., 1).

Another example is the set of all subsets of a finite set S. Given two subsets S1 and S2, their sum is S1 ∪ S2 (union), their product is S1 ∩ S2 (intersection), and the inverse of S1 is S\S1 (complement of S1 with respect to S). The neutral elements are the empty set Ø and S. If S has n elements, the number of subsets of S is 2^n.

A third example, the most important within the context of this course, is the set of all n-variable switching functions. Given two switching functions f and g, the functions f + g, f·g, and f' are defined as follows: ∀(x0, x1, ..., xn−1) ∈ B2^n:

(f + g)(x0, x1, ..., xn−1) = f(x0, x1, ..., xn−1) + g(x0, x1, ..., xn−1),
(f·g)(x0, x1, ..., xn−1) = f(x0, x1, ..., xn−1)·g(x0, x1, ..., xn−1),
f'(x0, x1, ..., xn−1) = (f(x0, x1, ..., xn−1))'.

The neutral elements are the constant functions 0 and 1.
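As a quick illustration (not from the book), the component-wise algebra B2^n can be sketched in Python with integers used as n-bit vectors; the function names bor, band, and bnot are hypothetical.

```python
from itertools import product

# Sketch: the Boolean algebra B2^n realized with Python ints as n-bit vectors.
# bor, band, bnot (illustrative names) are the component-wise sum, product, inverse.
N = 4
ONE = (1 << N) - 1               # the neutral element 1 = (1, 1, ..., 1)

def bor(a, b):  return a | b     # component-wise Boolean sum
def band(a, b): return a & b     # component-wise Boolean product
def bnot(a):    return a ^ ONE   # component-wise inverse

for a, b, c in product(range(1 << N), repeat=3):
    assert bor(a, 0) == a and band(a, ONE) == a                # neutral elements (2.2)
    assert bor(a, bnot(a)) == ONE and band(a, bnot(a)) == 0    # inverses (2.3)
    assert band(a, bor(b, c)) == bor(band(a, b), band(a, c))   # distributivity (2.6)
```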
Comment 2.3
Mathematicians have demonstrated that any finite Boolean algebra is isomorphic to B2^m for some m > 0. In particular, the number of elements of any finite Boolean algebra is a power of 2. Consider the previous examples.

1. The set of subsets of a finite set S = {s1, s2, ..., sn} is a Boolean algebra isomorphic to B2^n: associate to every subset S1 of S an n-component binary vector whose component i is equal to 1 if, and only if, si ∈ S1, and check that the vectors that correspond to the union of two subsets, the


intersection of two subsets, and the complement of a subset are obtained by executing the component-wise sum, the component-wise product, and the component-wise inversion of the associated n-component binary vectors.

2. The set of all n-variable switching functions is a Boolean algebra isomorphic to B2^m with m = 2^n: associate a number i to each of the 2^n elements of {0, 1}^n (for example, the natural number represented in binary numeration by this vector); then associate to any n-variable switching function f a 2^n-component vector whose component number i is the value of f at point i. In the case of the functions d and z of Table 2.3, n = 3, 2^n = 8, and the 8-component vectors that define d and z are (00010111) and (01101001), respectively.
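The isomorphism of item 1 can be checked with a minimal sketch (variable names are illustrative, not from the text):

```python
# Subsets of S = {s1, s2, s3} mapped to 3-component binary vectors: union,
# intersection, and complement become component-wise sum, product, inversion.
S = ('s1', 's2', 's3')
FULL = (1 << len(S)) - 1

def to_vector(subset):
    # component i is 1 if, and only if, si belongs to the subset
    return sum(1 << i for i, s in enumerate(S) if s in subset)

A, B = {'s1', 's3'}, {'s2', 's3'}
assert to_vector(A | B) == to_vector(A) | to_vector(B)    # union -> sum
assert to_vector(A & B) == to_vector(A) & to_vector(B)    # intersection -> product
assert to_vector(set(S) - A) == to_vector(A) ^ FULL       # complement -> inversion
```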

2.3.2 Some Additional Properties

Apart from rules (2.1)–(2.6), several additional properties can be demonstrated and used to minimize Boolean expressions and to optimize the corresponding circuits.

Properties 2.1

1. 0' = 1 and 1' = 0.    (2.7)

2. Idempotence: ∀a ∈ B: a + a = a and a·a = a.    (2.8)

3. ∀a ∈ B: a + 1 = 1 and a·0 = 0.    (2.9)

4. Inverse uniqueness: if a·b = 0, a + b = 1, a·c = 0 and a + c = 1, then b = c.    (2.10)

5. Involution: ∀a ∈ B: (a')' = a.    (2.11)

6. Absorption law: ∀a, b ∈ B: a + a·b = a and a·(a + b) = a.    (2.12)

7. ∀a, b ∈ B: a + a'·b = a + b and a·(a' + b) = a·b.    (2.13)

8. de Morgan laws: ∀a, b ∈ B: (a + b)' = a'·b' and (a·b)' = a' + b'.    (2.14)

9. Generalized de Morgan laws: ∀a1, a2, ..., an ∈ B:
(a1 + a2 + ... + an)' = a1'·a2'·...·an' and (a1·a2·...·an)' = a1' + a2' + ... + an'.    (2.15)

Proof
1. 0' = 0' + 0 = 1 and 1' = 1'·1 = 0.
2. a = a + 0 = a + (a·a') = (a + a)·(a + a') = (a + a)·1 = a + a;
   a = a·1 = a·(a + a') = (a·a) + (a·a') = (a·a) + 0 = a·a.


3. a + 1 = a + a + a' = a + a' = 1; a·0 = a·a·a' = a·a' = 0.
4. b = b·(a + c) = a·b + b·c = 0 + b·c = a·c + b·c = (a + b)·c = 1·c = c.
5. Direct consequence of (4).
6. a + a·b = a·1 + a·b = a·(1 + b) = a·1 = a; a·(a + b) = a·a + a·b = a + a·b = a.
7. a + a'·b = (a + a')·(a + b) = 1·(a + b) = a + b; a·(a' + b) = a·a' + a·b = 0 + a·b = a·b.
8. (a + b)·(a'·b') = a·a'·b' + b·a'·b' = 0·b' + 0·a' = 0 + 0 = 0;
   (a + b) + a'·b' = a + b + a'·b' = a + b + a' = b + 1 = 1.
9. By induction.
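In the two-element algebra B2, each of these properties can also be verified exhaustively; the following sketch (not from the book) checks a few of them:

```python
# Exhaustive check over B2 = {0, 1} of some of Properties 2.1.
from itertools import product

for a, b in product((0, 1), repeat=2):
    na, nb = 1 - a, 1 - b                          # inverses in B2
    assert a | (a & b) == a and a & (a | b) == a   # absorption (2.12)
    assert a | (na & b) == a | b                   # property 7 (2.13)
    assert 1 - (a | b) == na & nb                  # de Morgan (2.14)
    assert 1 - (a & b) == na | nb                  # de Morgan (2.14)
```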

2.3.3 Boolean Functions and Truth Tables

Tables such as Table 2.3, which defines the two switching functions d and z, are called truth tables. If f is an n-variable switching function, then its truth table has 2^n rows, that is, the number of different n-component vectors. In this section the relation between Boolean expressions, truth tables, and gate implementations of combinational circuits is analyzed.

Given a Boolean expression, that is, a well-constructed expression using variables and Boolean operations (sum, product, and inversion), a truth table can be defined. For that, the value of the expression must be computed for every combination of variable values, in total 2^n different combinations if there are n variables.

Example 2.3 Consider the following Boolean expression that defines a 3-variable switching function f:

f(a, b, c) = b·c' + a'·b.

Define a table with as many rows as the number of combinations of values of a, b, and c, that is, 2^3 = 8 rows, and compute the value of f that corresponds to each of them (Table 2.8).

Table 2.8 f(a, b, c) = b·c' + a'·b

a b c | c' | b·c' | a' | a'·b | f = b·c' + a'·b
0 0 0 | 1  |  0   | 1  |  0   | 0
0 0 1 | 0  |  0   | 1  |  0   | 0
0 1 0 | 1  |  1   | 1  |  1   | 1
0 1 1 | 0  |  0   | 1  |  1   | 1
1 0 0 | 1  |  0   | 0  |  0   | 0
1 0 1 | 0  |  0   | 0  |  0   | 0
1 1 0 | 1  |  1   | 0  |  0   | 1
1 1 1 | 0  |  0   | 0  |  0   | 0
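A truth table like this one can be regenerated programmatically; a minimal sketch, with f written directly from the Boolean expression of Example 2.3:

```python
# Truth table of f(a, b, c) = b·c' + a'·b (Example 2.3) obtained by
# enumerating the 2^3 combinations of variable values.
from itertools import product

def f(a, b, c):
    return (b & (1 - c)) | ((1 - a) & b)

column = [f(a, b, c) for a, b, c in product((0, 1), repeat=3)]
assert column == [0, 0, 1, 1, 0, 0, 1, 0]   # rows abc = 000 ... 111 (Table 2.8)
```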


Conversely, a Boolean expression can be associated with any truth table. For that, first define some new concepts.

Definitions 2.1
1. A literal is a variable or the inverse of a variable. For example, a, a', b, b', ... are literals.
2. An n-variable minterm is a product of n literals such that each variable appears only once. For example, if n = 3 then there are eight different minterms:

m0 = a'·b'·c', m1 = a'·b'·c, m2 = a'·b·c', m3 = a'·b·c,
m4 = a·b'·c', m5 = a·b'·c, m6 = a·b·c', m7 = a·b·c.    (2.16)

Their corresponding truth tables are shown in Table 2.9. Their main property is that to each minterm mi is associated one, and only one, combination of values of a, b, and c such that mi = 1:

m0 is equal to 1 if, and only if, abc = 000,
m1 is equal to 1 if, and only if, abc = 001,
m2 is equal to 1 if, and only if, abc = 010,
m3 is equal to 1 if, and only if, abc = 011,
m4 is equal to 1 if, and only if, abc = 100,
m5 is equal to 1 if, and only if, abc = 101,
m6 is equal to 1 if, and only if, abc = 110,
m7 is equal to 1 if, and only if, abc = 111.

In other words, mi = 1 if, and only if, abc is equal to the binary representation of i. Consider now a 3-variable function f defined by its truth table (Table 2.10). From Table 2.9 it can be deduced that f = m2 + m3 + m6, and thus (2.16)

f = a'·b·c' + a'·b·c + a·b·c'.    (2.17)

More generally, the n-variable minterm mi(xn−1, xn−2, ..., x0) is equal to 1 if, and only if, the value of xn−1xn−2 ... x0 is the binary representation of i. Given a truth table that defines an n-variable switching function f(xn−1, xn−2, ..., x0), this function is the sum of all minterms mi such that in−1in−2 ... i0 is the binary representation of i and f(in−1, in−2, ..., i0) = 1. This type of representation of a switching function under the form of a Boolean sum of minterms (like (2.17)) is called the canonical representation.
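The construction of the canonical representation from a truth table can be sketched as follows (helper names are illustrative, not from the text):

```python
# Sketch: build the canonical representation (Boolean sum of minterms) from a
# truth table given as a list of 2^n output values.
def canonical(truth, names):
    terms = []
    for i, v in enumerate(truth):
        if v == 1:                                  # keep minterm mi when f = 1
            bits = format(i, '0%db' % len(names))
            terms.append('·'.join(x if b == '1' else x + "'"
                                  for x, b in zip(names, bits)))
    return ' + '.join(terms)

# f of Table 2.10 is 1 on rows 010, 011, and 110, i.e., f = m2 + m3 + m6:
assert canonical([0, 0, 1, 1, 0, 0, 1, 0], ('a', 'b', 'c')) == \
       "a'·b·c' + a'·b·c + a·b·c'"
```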

Table 2.9 3-Variable minterms

a b c | m0 m1 m2 m3 m4 m5 m6 m7
0 0 0 |  1  0  0  0  0  0  0  0
0 0 1 |  0  1  0  0  0  0  0  0
0 1 0 |  0  0  1  0  0  0  0  0
0 1 1 |  0  0  0  1  0  0  0  0
1 0 0 |  0  0  0  0  1  0  0  0
1 0 1 |  0  0  0  0  0  1  0  0
1 1 0 |  0  0  0  0  0  0  1  0
1 1 1 |  0  0  0  0  0  0  0  1


Table 2.10 Truth table of f

a b c | f
0 0 0 | 0
0 0 1 | 0
0 1 0 | 1
0 1 1 | 1
1 0 0 | 0
1 0 1 | 0
1 1 0 | 1
1 1 1 | 0

Fig. 2.14 f = a'·b·c' + a'·b·c + a·b·c'

The relation between truth tables and Boolean expressions, namely the canonical representation, has been established. From a Boolean expression, for example (2.17), a circuit made up of logic gates can be deduced (Fig. 2.14). Another example: the functions p1, p2, p3, and p4 of Table 2.5 are minterms of the variables x, y, and c, and the circuit of Fig. 2.10 corresponds to the canonical representation of d.

Assume that a combinational system has been specified by some functional description, for example an algorithm (an implicit functional description). The following steps generate a logic circuit that implements the function.

• Translate the algorithm to a table (an explicit functional description); for that, execute the algorithm for all combinations of the input variable values.
• Generate the canonical representation that corresponds to the table.
• Optimize the expression using properties of Boolean algebras.
• Generate the corresponding circuit made up of logic gates.

As an example, consider the following algorithm that defines a 3-variable switching function.

Algorithm 2.1 Specification of f(a, b, c)

if (a = 1 and b = 1 and c = 0) or (a = 0 and b = 1) then f = 1;
else f = 0;
end if;


Fig. 2.15 Optimized circuit

Fig. 2.16 Implementation of (2.20)

By executing this algorithm for each of the eight combinations of values of a, b, and c, Table 2.10 is obtained. The corresponding canonical expression is (2.17). This expression can be simplified using Boolean algebra properties:

a'·b·c' + a'·b·c + a·b·c' = a'·b·(c' + c) + (a' + a)·b·c' = a'·b + b·c'.

The corresponding circuit is shown in Fig. 2.15. It implements the same function as the circuit of Fig. 2.14, with fewer gates and fewer gate inputs. This is an example of the kind of circuit optimization that Boolean algebras permit.
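The optimization step of this example can be checked exhaustively; a minimal sketch:

```python
# Check that the optimized expression a'·b + b·c' implements the same function
# as Algorithm 2.1, by executing both for all input combinations.
from itertools import product

def algorithm(a, b, c):                        # Algorithm 2.1
    return 1 if (a == 1 and b == 1 and c == 0) or (a == 0 and b == 1) else 0

def optimized(a, b, c):                        # a'·b + b·c'
    return ((1 - a) & b) | (b & (1 - c))

assert all(algorithm(a, b, c) == optimized(a, b, c)
           for a, b, c in product((0, 1), repeat=3))
```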

2.3.4 Example

The 4-bit adder of Sect. 2.2 is now revisited and completed. A first step is to divide the 4-bit adder into four 1-bit adders (Fig. 2.5). Each 1-bit adder implements two switching functions d and z defined by their truth tables (Table 2.3). The canonical expressions that correspond to the truth tables of d and z are the following:

d = x'·y·c + x·y'·c + x·y·c' + x·y·c,    (2.18)

z = x'·y'·c + x'·y·c' + x·y'·c' + x·y·c.    (2.19)

The next step is to optimize the Boolean expressions. Equation (2.18) can be optimized as follows:

d = (x + x')·y·c + x·(y + y')·c + x·y·(c + c') = y·c + x·c + x·y.    (2.20)
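The optimized carry (2.20) can be checked against the arithmetic definition of the 1-bit adder; a minimal sketch:

```python
# Check of (2.20): the optimized carry d = y·c + x·c + x·y, together with the
# sum z = (x + y + c) mod 2, encodes the arithmetic sum of the three input bits.
from itertools import product

for x, y, c in product((0, 1), repeat=3):
    d = (y & c) | (x & c) | (x & y)
    z = (x + y + c) % 2
    assert 2 * d + z == x + y + c      # the pair (d, z) is the 2-bit sum
```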


Fig. 2.17 Implementation of (2.19)

The corresponding circuit is shown in Fig. 2.16. It implements the same function d as the circuit of Fig. 2.11, with fewer gates, fewer gate inputs, and without inverters. Equation 2.19 cannot be simplified. The corresponding circuit is shown in Fig. 2.17.

2.4 Logic Gates

In Sects. 2.2 and 2.3 a first approach to the implementation of switching functions has been proposed. It is based on the translation of the initial specification to Boolean expressions. Then, circuits made up of AND gates, OR gates, and inverters can easily be defined. However, there exist other components (Sect. 1.3.2) that can be used to implement switching functions.

2.4.1 NAND and NOR

NAND gates and NOR gates have been defined in Sect. 1.3.2. They can be considered simple extensions of the CMOS inverter and are relatively easy to implement in CMOS technology. A NAND gate is equivalent to an AND gate followed by an inverter, and a NOR gate is equivalent to an OR gate followed by an inverter (Fig. 2.18). The truth tables of the 2-input NAND function and of the 2-input NOR function are shown in Figs. 1.21a and 1.22b, respectively. More generally, the output of a k-input NAND gate is equal to 0 if, and only if, the k inputs are equal to 1, and the output of a k-input NOR gate is equal to 1 if, and only if, the k inputs are equal to 0. Thus,

NAND(x1, x2, ..., xn) = (x1·x2·...·xn)' = x1' + x2' + ... + xn',    (2.21)

NOR(x1, x2, ..., xn) = (x1 + x2 + ... + xn)' = x1'·x2'·...·xn'.    (2.22)

Sometimes, the following algebraic symbols are used:


Fig. 2.18 NAND2 and NOR2 symbols and equivalent circuits

Fig. 2.19 NOT, AND2, and OR2 gates implemented with NAND2 gates and inverters

a ↑ b = NAND(a, b) and a ↓ b = NOR(a, b).

NAND and NOR gates are universal modules. That means that any switching function can be implemented only with NAND gates, or only with NOR gates. It has been seen in Sect. 2.3 that any switching function can be implemented with AND gates, OR gates, and inverters (NOT gates). To demonstrate that NAND gates are universal modules, it is sufficient to observe that the AND function, the OR function, and the inversion can be implemented with NAND functions. According to (2.21),

x1·x2·...·xn = ((x1·x2·...·xn)')' = (NAND(x1, x2, ..., xn))',    (2.23)

x1 + x2 + ... + xn = NAND(x1', x2', ..., xn'),    (2.24)

x' = (x·1)' = NAND(x, 1) = (x·x)' = NAND(x, x).    (2.25)

As an example, NOT, AND2, and OR2 gates implemented with NAND2 gates are shown in Fig. 2.19. Similarly, to demonstrate that NOR gates are universal modules, it is sufficient to observe that the AND function, the OR function, and the inversion can be implemented with NOR functions. According to (2.22),

x1 + x2 + ... + xn = ((x1 + x2 + ... + xn)')' = (NOR(x1, x2, ..., xn))',    (2.26)

x1·x2·...·xn = NOR(x1', x2', ..., xn'),    (2.27)

x' = (x + 0)' = NOR(x, 0) = (x + x)' = NOR(x, x).    (2.28)
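The universality of NAND, expressed by (2.23)–(2.25), can be sketched in a few lines (function names are illustrative):

```python
# NAND2 as a universal module: NOT, AND2, and OR2 built only from nand2.
from itertools import product

def nand2(a, b):
    return 1 - (a & b)

def not_(a):                                   # x' = NAND(x, x), see (2.25)
    return nand2(a, a)

def and2(a, b):                                # a·b = (NAND(a, b))', see (2.23)
    return not_(nand2(a, b))

def or2(a, b):                                 # a + b = NAND(a', b'), see (2.24)
    return nand2(not_(a), not_(b))

for a, b in product((0, 1), repeat=2):
    assert not_(a) == 1 - a
    assert and2(a, b) == a & b
    assert or2(a, b) == a | b
```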

Example 2.4 Consider the circuit of Fig. 2.11. According to (2.23) and (2.24), the AND gates and the OR gate can be substituted by NAND gates. The result is shown in Fig. 2.20a. Furthermore, two serially connected inverters can be substituted by a simple connection (Fig. 2.20b).

Comments 2.4
1. Neither the 2-variable NAND function (NAND2) nor the 2-variable NOR function (NOR2) is an associative operation. For example,

NAND(a, NAND(b, c)) = a' + (NAND(b, c))' = a' + b·c,
NAND(NAND(a, b), c) = a·b + c',

and none of the previous functions is equal to NAND(a, b, c) = a' + b' + c'.


Fig. 2.20 Circuits equivalent to Fig. 2.11

Fig. 2.21 XOR gate and XNOR gate symbols

NANDða, NANDðb; cÞÞ ¼ a þ NANDðb; cÞ ¼ a þ b c, NANDðNANDða; bÞ, cÞ ¼ a b þ c; and none of the previous functions is equal to NANDða; b; cÞ ¼ a þ b þ c. 2. As already mentioned above, NAND gates and NOR gates are easy to implement in CMOS technology. On the contrary, AND gates and OR gates must be implemented by connecting a NAND gate and an inverter or a NOR gate and an inverter, respectively. Thus, within a CMOS integrated circuit, NAND gates and NOR gates use less silicon area than AND gates and OR gates.

2.4.2 XOR and XNOR

XOR gates, where XOR stands for eXclusive OR, and XNOR gates are other commonly used components, especially in arithmetic circuits. The 2-variable XOR switching function is defined as follows:


Table 2.11 XOR and XNOR truth tables

a b | XOR(a, b) | XNOR(a, b)
0 0 |     0     |     1
0 1 |     1     |     0
1 0 |     1     |     0
1 1 |     0     |     1

Fig. 2.22 3-Input and 4-input XOR gates and XNOR gates

Fig. 2.23 4-Input XOR and XNOR gates implemented with 2-input gates

XOR(a, b) = 1 if, and only if, a ≠ b,

and the 2-variable XNOR switching function is the inverse of the XOR function, so that

XNOR(a, b) = 1 if, and only if, a = b.

Their symbols are shown in Fig. 2.21 and their truth tables are defined in Table 2.11. The following algebraic symbols are used:

a ⊕ b = XOR(a, b), a ⊙ b = XNOR(a, b).

An equivalent definition of the XOR function is

XOR(a, b) = (a + b) mod 2 = a ⊕ b.

With this equivalent definition an n-variable XOR switching function can be defined for any n > 2:

XOR(a1, a2, ..., an) = (a1 + a2 + ... + an) mod 2 = a1 ⊕ a2 ⊕ ... ⊕ an,

and the n-variable XNOR switching function is the inverse of the XOR function:

XNOR(a1, a2, ..., an) = (XOR(a1, a2, ..., an))'.

Examples of XOR gate and XNOR gate symbols are shown in Fig. 2.22. The mod 2 sum is an associative operation, so that n-input XOR gates can be implemented with 2-input XOR gates. As an example, in Fig. 2.23a a 4-input XOR gate is implemented with three 2-input XOR gates.


An n-input XNOR gate is implemented by the same circuit as an n-input XOR gate in which the XOR gate that generates the output is substituted by an XNOR gate. In Fig. 2.23b a 4-input XNOR gate is implemented with two 2-input XOR gates and a 2-input XNOR gate.

XOR gates and XNOR gates are not universal modules. However, they are very useful to implement arithmetic functions.

Example 2.5 As a first example consider a 4-bit magnitude comparator: given two 4-bit numbers a = a3a2a1a0 and b = b3b2b1b0, generate a switching function comp equal to 1 if, and only if, a = b. The following trivial algorithm is used:

if (a3 = b3) and (a2 = b2) and (a1 = b1) and (a0 = b0) then comp = 1;
else comp = 0;
end if;

The corresponding circuit is shown in Fig. 2.24: comp = 1 if, and only if, the four inputs of the NOR4 gate are equal to 0, that is, if ai = bi and thus XOR(ai, bi) = 0, ∀i = 0, 1, 2, and 3.
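The comparator of Example 2.5 can be modeled behaviorally; a minimal sketch (names are illustrative):

```python
# Behavioral sketch of the 4-bit comparator of Fig. 2.24: comp is the NOR of
# the four bit-wise XOR outputs, so comp = 1 if, and only if, a = b.
def comparator(a_bits, b_bits):
    xors = [ai ^ bi for ai, bi in zip(a_bits, b_bits)]   # four XOR2 gates
    return 0 if any(xors) else 1                         # 4-input NOR gate

assert comparator((1, 0, 1, 1), (1, 0, 1, 1)) == 1
assert comparator((1, 0, 1, 1), (1, 0, 0, 1)) == 0
```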

Fig. 2.24 4-Bit magnitude comparator

Fig. 2.25 Transmission of 8-bit data (data source, 8-bit parity bit generator, transmission lines, data destination, 9-bit parity bit generator, error signal)

Fig. 2.26 Parity bit generation and parity check

Fig. 2.27 1-Bit adder

Example 2.6 The second example is a parity bit generator. It implements an n-variable switching function parity(a0, a1, ..., an−1) = 1 if, and only if, there is an odd number of 1s among the variables a0, a1, ..., an−1. In other words,

parity(a0, a1, ..., an−1) = (a0 + a1 + ... + an−1) mod 2 = a0 ⊕ a1 ⊕ ... ⊕ an−1.

Consider a communication system (Fig. 2.25) that must transmit 8-bit data d = d0d1...d7 from a data source circuit to a data destination circuit. On the source side, an 8-bit parity generator generates an additional bit d8 = d0 ⊕ d1 ⊕ ... ⊕ d7, and the nine bits d0d1...d7d8 are transmitted. Thus, the number of 1s among the transmitted bits d0, d1, ..., d8 is always even. On the destination side, a 9-bit parity generator checks whether the number of 1s among d0, d1, ..., d8 is even or not. If even, the parity generator output is equal to 0; if odd, the output is equal to 1. If it is assumed that during the transmission at most one bit could have been modified, due to noise on the transmission lines, the 9-bit parity generator output is an error signal equal to 0 if no error has happened and equal to 1 in the contrary case. An 8-bit parity generator and a 9-bit parity generator implemented with XOR2 gates are shown in Fig. 2.26.

Example 2.7 The most common use of XOR gates is within adders. A 1-bit adder implements two switching functions z and d defined by Table 2.3 and by (2.19) and (2.20). According to Table 2.3, z can also be expressed as follows:

z = (x + y + c) mod 2 = x ⊕ y ⊕ c.    (2.29)
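The parity generation and check of Example 2.6 can be sketched as follows (not from the book; names are illustrative):

```python
# Sketch of the parity scheme of Example 2.6 and Fig. 2.25: an 8-bit parity
# generator on the source side, a 9-bit parity check on the destination side.
from functools import reduce
from operator import xor

def parity(bits):                    # (b0 + b1 + ...) mod 2 = XOR of all bits
    return reduce(xor, bits, 0)

data = [1, 0, 1, 1, 0, 0, 1, 0]
d8 = parity(data)                    # transmitted parity bit
assert parity(data + [d8]) == 0      # no error: an even number of 1s

corrupted = list(data)
corrupted[3] ^= 1                    # a single bit modified by noise
assert parity(corrupted + [d8]) == 1 # the error is detected
```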

On the other hand, d is equal to 1 if, and only if, x + y + c ≥ 2. This condition can be expressed in the following way: either x = y = 1, or c = 1 and x ≠ y. The corresponding Boolean expression is

d = x·y + c·(x ⊕ y).    (2.30)

Fig. 2.28 Tristate buffer and tristate inverter symbols

Table 2.12 Definition of tristate buffer and tristate inverter

c x | 3-state buffer output y | 3-state inverter output y
0 0 |            Z            |            Z
0 1 |            Z            |            Z
1 0 |            0            |            1
1 1 |            1            |            0

Fig. 2.29 Symbols of tristate components with active-low control input

Table 2.13 Definition of tristate components with active-low control input

c x | 3-state buffer output y | 3-state inverter output y
0 0 |            0            |            1
0 1 |            1            |            0
1 0 |            Z            |            Z
1 1 |            Z            |            Z
The circuit that corresponds to (2.29) and (2.30) is shown in Fig. 2.27a. As mentioned above (Fig. 2.20), AND gates and OR gates can be implemented with NAND gates (Fig. 2.27b).

2.4.3 Tristate Buffers and Tristate Inverters

Tristate buffers and tristate inverters are components whose output can be in three different states: 0 (low voltage), 1 (high voltage), or Z (disconnected). A tristate buffer CMOS implementation is shown in Fig. 1.28a: when the control input c = 0, the output is disconnected from the input, so that the output impedance is very high (infinite if leakage currents are not considered); if c = 1, the output is connected to the input through a CMOS switch. A tristate inverter is equivalent to an inverter whose output is connected to a tristate buffer. It works as follows: when the control input c = 0, the output is disconnected from the input; if c = 1, the output is equal to the inverse of the input. The symbols of a tristate buffer and of a tristate inverter are shown in Fig. 2.28 and their behavior is defined in Table 2.12.


Fig. 2.30 4-Bit bus

Table 2.14 4-Bit bus definition

cA cB | Data transmission
 0  0 | None
 0  1 | B → C
 1  0 | A → C
 1  1 | Not allowed

In some tristate components the control signal c is active at low level. The corresponding symbols and definitions are shown in Fig. 2.29 and Table 2.13.

A typical application of tristate components is shown in Fig. 2.30. It is a 4-bit bus that permits sending 4-bit data either from circuit A to circuit C or from circuit B to circuit C. As an example, A could be a memory, B an input interface, and C a processor. Both circuits A and B must be able to send data to C but cannot be directly connected to C. To avoid collisions, 3-state buffers are inserted between the outputs of A and B and the set of wires connected to the inputs of circuit C. To transmit data from A to C, cA = 1 and cB = 0, and to transmit data from B to C, cA = 0 and cB = 1 (Table 2.14).
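The bus of Fig. 2.30 can be modeled behaviorally, with 'Z' standing for a disconnected output; a minimal sketch (names are illustrative):

```python
# Behavioral sketch of the 4-bit bus of Fig. 2.30 and Table 2.14; 'Z' models
# the disconnected output of a tristate buffer.
def tristate(c, x):
    return x if c == 1 else 'Z'

def bus(cA, a_bits, cB, b_bits):
    assert not (cA and cB), "collision: cA = cB = 1 is not allowed"
    return [tristate(cA, a) if cA else tristate(cB, b)
            for a, b in zip(a_bits, b_bits)]

assert bus(1, [1, 0, 1, 1], 0, [0, 1, 1, 0]) == [1, 0, 1, 1]   # A -> C
assert bus(0, [1, 0, 1, 1], 1, [0, 1, 1, 0]) == [0, 1, 1, 0]   # B -> C
assert bus(0, [1, 0, 1, 1], 0, [0, 1, 1, 0]) == ['Z'] * 4      # no transmission
```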

2.5 Synthesis Tools

In order to efficiently implement combinational circuits, synthesis tools are necessary. In this section, some of the principles used to optimize combinational circuits are described.

2.5.1 Redundant Terms

When defining a switching function, it may happen that for some combinations of input variable values the corresponding output value is not defined, either because those input value combinations never happen or because the function value does not matter. In the truth table, the corresponding entries are named "don't care" (instead of 0 or 1). When defining a Boolean expression that describes the switching function to be implemented, the minterms that correspond to those don't care entries can be used, or not, in order to optimize the final circuit.


Fig. 2.31 BCD to 7-segment decoder

Table 2.15 BCD to 7-segment decoder definition

Digit | x3x2x1x0 | A B C D E F G
  0   |   0000   | 1 1 1 1 1 1 0
  1   |   0001   | 0 1 1 0 0 0 0
  2   |   0010   | 1 1 0 1 1 0 1
  3   |   0011   | 1 1 1 1 0 0 1
  4   |   0100   | 0 1 1 0 0 1 1
  5   |   0101   | 1 0 1 1 0 1 1
  6   |   0110   | 1 0 1 1 1 1 1
  7   |   0111   | 1 1 1 0 0 0 0
  8   |   1000   | 1 1 1 1 1 1 1
  9   |   1001   | 1 1 1 0 0 1 1
      |   1010   | – – – – – – –
      |   1011   | – – – – – – –
      |   1100   | – – – – – – –
      |   1101   | – – – – – – –
      |   1110   | – – – – – – –
      |   1111   | – – – – – – –

Example 2.8 A BCD to 7-segment decoder (Fig. 2.31) is a combinational circuit with four inputs x3, x2, x1, and x0 that are the binary representation of a decimal digit (BCD means binary coded decimal) and seven outputs that control the seven segments of a display. Among the 16 combinations of x3, x2, x1, and x0 values, only 10 are used: those that correspond to the digits 0–9. Thus, the values of the outputs A to G that correspond to the inputs 1010 to 1111 are unspecified (don't care). The BCD to 7-segment decoder is defined by Table 2.15. If all don't care entries are substituted by 0s, the following set of Boolean expressions is obtained:

A = x3'·x1 + x3'·x2·x0 + x3·x2'·x1',    (2.31a)

B = x3'·x2' + x2'·x1' + x3'·x1·x0 + x3'·x1'·x0',    (2.31b)

C = x2'·x1' + x3'·x0 + x3'·x2,    (2.31c)

D = x2'·x1'·x0' + x3'·x2'·x1 + x3'·x1·x0' + x3'·x2·x1'·x0,    (2.31d)


Table 2.16 Another definition of function B

x3x2x1x0 | B
  0000   | 1
  0001   | 1
  0010   | 1
  0011   | 1
  0100   | 1
  0101   | 0
  0110   | 0
  0111   | 1
  1000   | 1
  1001   | 1
  1010   | 1
  1011   | 1
  1100   | 1
  1101   | 0
  1110   | 0
  1111   | 1

E = x2'·x1'·x0' + x3'·x1·x0',    (2.31e)

F = x3'·x1'·x0' + x3'·x2·x1' + x3'·x2·x0' + x3·x2'·x1',    (2.31f)

G = x3'·x2'·x1 + x3'·x2·x1' + x3'·x2·x0' + x3·x2'·x1'.    (2.31g)

For example, B can be expressed as the sum of the minterms m0, m1, m2, m3, m4, m7, m8, and m9:

B = x3'·x2'·x1'·x0' + x3'·x2'·x1'·x0 + x3'·x2'·x1·x0' + x3'·x2'·x1·x0 + x3'·x2·x1'·x0' + x3'·x2·x1·x0 + x3·x2'·x1'·x0' + x3·x2'·x1'·x0.

Then, the previous expression can be minimized:

B = x3'·x2'·(x1'·x0' + x1'·x0 + x1·x0' + x1·x0) + x3'·x2·x1'·x0' + x3'·x2·x1·x0 + x3·x2'·x1'·(x0' + x0)
  = x3'·x2' + x3'·x2·x1'·x0' + x3'·x2·x1·x0 + x3·x2'·x1'
  = x3'·x2' + x3'·x2·x1'·x0' + x3'·x2·x1·x0 + x3·x2'·x1' + x3'·x2'·x1'·x0' + x3'·x2'·x1·x0 + x3'·x2'·x1'
  = x3'·x2' + x3'·(x2 + x2')·x1'·x0' + x3'·(x2 + x2')·x1·x0 + (x3 + x3')·x2'·x1'
  = x3'·x2' + x3'·x1'·x0' + x3'·x1·x0 + x2'·x1'.    (2.32)

By performing the same type of optimization for all the other functions, the set of equations (2.31) has been obtained.

In Table 2.16 the don't care entries of function B have been defined in another way, and a different Boolean expression is obtained: according to Table 2.16, B = 1 if, and only if, x2 = 0 or x1x0 = 00 or 11; thus

B = x2' + x1'·x0' + x1·x0.    (2.33)

Equations (2.32) and (2.33) are compatible with the initial specification (Table 2.15). They generate different values of B when x3x2x1x0 = 1010, 1011, 1100, or 1111, but in those cases the value of B


Table 2.17 Comparison between (2.31) and (2.34)

Gate type | Number of gates (2.31) | Number of gates (2.34)
  AND2    |           6            |          14
  AND3    |          17            |           1
  AND4    |           1            |           –
  OR2     |           1            |           1
  OR3     |           2            |           2
  OR4     |           4            |           4
  NOT     |           4            |           4

does not matter. On the other hand, (2.33) is simpler than (2.32) and would correspond to a better implementation. By performing the same type of optimization for all the other functions, the following set of expressions has been obtained:

A = x1 + x2·x0 + x3,    (2.34a)

B = x2' + x1·x0 + x1'·x0',    (2.34b)

C = x1' + x0 + x2,    (2.34c)

D = x2'·x0' + x2'·x1 + x1·x0' + x2·x1'·x0,    (2.34d)

E = x2'·x0' + x1·x0',    (2.34e)

F = x1'·x0' + x2·x1' + x2·x0' + x3,    (2.34f)

G = x2'·x0' + x2·x1' + x1·x0' + x3.    (2.34g)

To summarize:

• If the "don't care" entries of Table 2.15 are replaced by 0s, the set of equations (2.31) is obtained.
• If they are replaced by either 0 or 1, according to some optimization method (not described in this course), the set of equations (2.34) is obtained.

In Table 2.17 the numbers of AND, OR, and NOT gates necessary to implement (2.31) and (2.34) are shown. The circuit that implements (2.31) has 6·2 + 17·3 + 1·4 + 1·2 + 2·3 + 4·4 + 4·1 = 95 gate inputs and the circuit that implements (2.34) has 14·2 + 1·3 + 1·2 + 2·3 + 4·4 + 4·1 = 59 gate inputs. Obviously, the second circuit is better.
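A sketch checking two of the optimized equations (2.34) against the specification of Table 2.15, skipping the don't care rows (the equations as written here follow the reconstruction above):

```python
# Check (2.34b) and (2.34e) against Table 2.15 on the ten specified digits;
# rows 1010-1111 are don't cares and are skipped.
SPEC_B = [1, 1, 1, 1, 1, 0, 0, 1, 1, 1]        # segment B, digits 0..9
SPEC_E = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0]        # segment E, digits 0..9

for digit in range(10):
    x3, x2, x1, x0 = (digit >> 3) & 1, (digit >> 2) & 1, (digit >> 1) & 1, digit & 1
    B = (1 - x2) | (x1 & x0) | ((1 - x1) & (1 - x0))       # (2.34b)
    E = ((1 - x2) & (1 - x0)) | (x1 & (1 - x0))            # (2.34e)
    assert B == SPEC_B[digit] and E == SPEC_E[digit]
```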

2.5.2 Cube Representation

A combinational circuit synthesis tool is a set of programs that generates optimized circuits, according to some criteria (cost, delay, power), starting either from logic expressions or from tables. The cube representation of combinational functions is an easy way to define Boolean expressions within a computer programming environment.

The set B2^n of n-component binary vectors can be considered as a cube (actually a hypercube) of dimension n. For example, if n = 3, the set B2^3 of 3-component binary vectors is represented by the cube of Fig. 2.32a.


Fig. 2.32 Cubes

Fig. 2.33 Solutions of x2·x0' = 1

A subset of B2^n defined by giving a particular value to m vector components is a subcube B2^(n−m) of dimension n − m. As an example, the subset of the vectors of B2^3 whose first coordinate is equal to 0 (Fig. 2.32b) is a cube of dimension 2 (actually a square). Another example: the subset of the vectors of B2^3 whose first coordinate is 1 and whose third coordinate is 0 (Fig. 2.32c) is a cube of dimension 1 (actually a straight line).

Consider a 4-variable function f defined by the following Boolean expression:

f(x3, x2, x1, x0) = x2·x0'.

This function is equal to 1 if, and only if, x2 = 1 and x0 = 0, that is,

f = 1 iff (x3, x2, x1, x0) ∈ {x ∈ B2^4 | x2 = 1 and x0 = 0}.

In other words, f = 1 if, and only if, (x3, x2, x1, x0) belongs to the 2-dimensional cube of Fig. 2.33. This example suggests another definition.

Definition 2.2 A cube is a set of elements of B2^n where a product of literals (Definitions 2.1) is equal to 1.

In this chapter switching functions have been expressed under the form of sums of products of literals (e.g., (2.19) and (2.20)), and to those expressions correspond implementations by means of logic gates (e.g., Figs. 2.17 and 2.16). According to Definition 2.2, a set of elements of B2^n where a product of literals is equal to 1 is a cube. Thus, a sum of products of literals can also be defined as a union of cubes that defines the set of points of B2^n where f = 1. In what follows, cube and product of literals are considered synonymous.

How can a product of literals be represented within a computer programming environment? For that, an order of the variables must be defined, for example (as above) xn−1, xn−2, ..., x1, x0. Then consider a product p of literals. It is represented by an n-component ternary vector (pn−1, pn−2, ..., p1, p0) where

• pi = 0 if xi is in p under inverted form (xi').
• pi = 1 if xi is in p under non-inverted form (xi).
• pi = X if xi is not in p.
Example 2.9 (with n = 4) The set of cubes that describes (2.31d) is {X000, 001X, 0X10, 0101}, and the set of cubes that corresponds to (2.34g) is {X0X0, X10X, XX10, 1XXX}. Conversely, the product of literals represented by 1X01 is x3·x1'·x0 and the product of literals represented by X1X0 is x2·x0'.
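The ternary-vector representation translates directly into code; a minimal sketch (helper names are illustrative):

```python
# Sketch of the cube notation: a ternary string over {'0', '1', 'X'}, ordered
# (x3, x2, x1, x0), converted to its product of literals.
def cube_to_product(cube, names=('x3', 'x2', 'x1', 'x0')):
    lits = [n if c == '1' else n + "'"
            for n, c in zip(names, cube) if c != 'X']
    return '·'.join(lits)

def cube_contains(cube, point):      # does the point belong to the cube?
    return all(c in ('X', b) for c, b in zip(cube, point))

assert cube_to_product('1X01') == "x3·x1'·x0"
assert cube_contains('X1X0', '0110')     # x2·x0' equals 1 at x3x2x1x0 = 0110
```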


Fig. 2.34 Union of cubes

2.5.3 Adjacency

Adjacency is the basic concept that permits the optimization of Boolean expressions. Two m-dimensional cubes are adjacent if their associated ternary vectors differ in only one position. As an example (n = 3), the 1-dimensional cubes X11 (Fig. 2.34a) and X01 (Fig. 2.34b) are adjacent and their union is the 2-dimensional cube XX1 = X11 ∪ X01 (Fig. 2.34c). The corresponding products of literals are the following: X11 represents x1·x0, X01 represents x1'·x0, and their union XX1 represents x0. In terms of products of literals, the union of the two adjacent cubes is the sum of the corresponding products:

x1·x0 + x1'·x0 = (x1 + x1')·x0 = 1·x0 = x0.

Thus, if a function f is defined by a union of cubes and if two cubes are adjacent, then they can be replaced by their union. The result, in terms of products of literals, is that two products of n − m literals are replaced by a single product of n − m − 1 literals.

Example 2.10 A function f of four variables a, b, c, and d is defined by its minterms (Definitions 2.1):

f(a, b, c, d) = a'·b'·c·d' + a'·b'·c·d + a'·b·c'·d + a'·b·c·d' + a'·b·c·d + a·b'·c'·d'.

The corresponding set of cubes is

{0010, 0011, 0101, 0110, 0111, 1000}.

The following adjacencies permit the simplification of the representation of f:

0010 ∪ 0011 = 001X, 0110 ∪ 0111 = 011X, 0101 ∪ 0111 = 01X1.

Thanks to the idempotence property (2.8), the same cube (0111 in this example) can be used several times. The simplified set of cubes is

{001X, 011X, 01X1, 1000}.

There remains an adjacency:

001X ∪ 011X = 0X1X.

The final result is

{0X1X, 01X1, 1000} and the corresponding Boolean expression is f = a'·c + a'·b·d + a·b'·c'·d'. To conclude, repeated use of the fact that two adjacent cubes can be replaced by a single cube generates new Boolean expressions, equivalent to the initial one and with fewer terms. Furthermore, the new terms have fewer literals. This is the basis of most automatic optimization tools. All commercial synthesis tools include programs that automatically generate optimal circuits according to some criteria, such as cost, delay, or power consumption, and starting from several types of specification. For educational purposes, open-source tools are available, for example Logisim (Burch 2005).
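The repeated merging of adjacent cubes can be sketched in a few lines of Python (an illustrative helper, essentially the first phase of the Quine–McCluskey method; cubes are strings over '0', '1', 'X'):

```python
def prime_implicants(cubes):
    """Repeatedly merge adjacent cubes.

    Two cubes are adjacent when they are identical except in one
    position where one has '0' and the other '1'; their union replaces
    that position by 'X'. Cubes that take part in no merge are kept."""
    cubes = set(cubes)
    result = set()
    while cubes:
        merged, used = set(), set()
        for c1 in cubes:
            for c2 in cubes:
                d = [i for i in range(len(c1)) if c1[i] != c2[i]]
                if len(d) == 1 and 'X' not in (c1[d[0]], c2[d[0]]):
                    merged.add(c1[:d[0]] + 'X' + c1[d[0] + 1:])
                    used.update((c1, c2))
        result |= cubes - used          # cubes that could not be merged
        cubes = merged
    return result
```

Applied to the cubes of Example 2.10, {0010, 0011, 0101, 0110, 0111, 1000}, it returns {0X1X, 01X1, 1000}, the final set stated above.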

2.5.4 Karnaugh Map

In the case of switching functions of a few variables, a graphical method can be used to detect adjacencies and to optimize Boolean expressions. Consider the function f(a, b, c, d) of Example 2.10. It can be represented by the Karnaugh map (Karnaugh 1953) of Fig. 2.35a. Observe the enumeration ordering of rows and columns (00, 01, 11, 10): the variable values that correspond to a row (a column) and to the next row (the next column) differ in only one position.

Fig. 2.35 Karnaugh maps (maps a–e; the marked cubes include 0101, 01X1, 0010, 0X10, 1000, 0X1X, 0XXX, and XX1X)


Fig. 2.36 Optimization of f (Karnaugh map of f with the cubes 01X1, 0X1X, and 1000 marked)

Fig. 2.37 Functions g and h (a: 3-variable map of g with cubes 0X0 and 1X1; b: 4-variable map of h with cubes 01X1, 011X, 0X11, 10X0, 100X, and 1X00)

Each 1 of this graphical representation is associated with a minterm of the function (a 0-dimensional cube). Several examples are shown in Fig. 2.35b. Thanks to the chosen enumeration ordering, groups of two adjacent 1s, like those of Fig. 2.35c, are associated with 1-dimensional cubes. A group of four adjacent 1s, like the one of Fig. 2.35d, is associated with a 2-dimensional cube. Groups of eight adjacent 1s, like those of Fig. 2.35e (another switching function), are associated with 3-dimensional cubes. Thus (Fig. 2.36) the function f(a, b, c, d) of Example 2.10 can be expressed as the Boolean sum of three cubes 0X1X, 01X1, and 1000, so that f = a'·c + a'·b·d + a·b'·c'·d'. It is important to observe that the rightmost cells and the leftmost cells are adjacent, and so are the uppermost cells and the bottom cells (as if the map were drawn on the surface of a torus). Two additional examples are given in Fig. 2.37. Function g of Fig. 2.37a can be expressed as the Boolean sum of two 1-dimensional cubes 0X0 and 1X1, so that g = x2'·x0' + x2·x0, and function h of Fig. 2.37b can be expressed as the Boolean sum of six 1-dimensional cubes 01X1, 011X, 0X11, 10X0, 100X, and 1X00, so that h = a'·b·d + a'·b·c + a'·c·d + a·b'·d' + a·b'·c' + a·c'·d'.
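The minimized expression can be checked exhaustively against the minterm list of Example 2.10; a small sketch (complements are written as 1 − v):

```python
# f = a'·c + a'·b·d + a·b'·c'·d' should cover exactly the minterms
# {0010, 0011, 0101, 0110, 0111, 1000} of Example 2.10.
def f_min(a, b, c, d):
    return ((1 - a) & c) | ((1 - a) & b & d) | (a & (1 - b) & (1 - c) & (1 - d))

covered = {(a, b, c, d)
           for a in (0, 1) for b in (0, 1) for c in (0, 1) for d in (0, 1)
           if f_min(a, b, c, d)}
minterms = {(0, 0, 1, 0), (0, 0, 1, 1), (0, 1, 0, 1),
            (0, 1, 1, 0), (0, 1, 1, 1), (1, 0, 0, 0)}
assert covered == minterms
```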


Fig. 2.38 Propagation time tp (a: NOR2 gate with inputs a, b and output z; b: waveforms showing the delay tp between the change of b and the change of z)

Fig. 2.39 Example of propagation time computation (a: circuit with gate delays τ; b: waveforms of NOT(c), d·NOT(c), and z, changing after τ, 2τ, and 3τ ns)

2.6 Propagation Time

Logic components such as gates are physical systems. Any change of their state, for example the output voltage transition from some level to another level, needs some quantity of energy and therefore some time (zero delay would mean infinite power). Thus, apart from their function (AND2, OR3, NAND4, and so on), logic gates are also characterized by their propagation time (delay) between inputs and outputs. Consider a simple NOR2 gate (Fig. 2.38a). Assume that initially a = b = 0. Then z = NOR(0, 0) = 1 (Fig. 2.38b). When b rises from 0 to 1, then NOR(0, 1) = 0 and z must fall from 1 to 0. However, the output state change is not immediate; there is a small delay tp, generally expressed in nanoseconds (ns) or picoseconds (ps).

Example 2.11 The circuit of Fig. 2.39a implements the 5-variable switching function z = a·b + c'·d + e. Assume that all components (AND2, NOT, OR3) have the same propagation time τ ns. Initially a = 0, b is either 0 or 1, c = 1, d = 1, and e = 0. Thus z = 0·b + 1'·1 + 0 = 0. If c falls from 1 down to 0 then the new value of z must be z = 0·b + 0'·1 + 0 = 1. However, this output state change takes some time: the inverter output c' changes after τ ns, the AND2 output c'·d changes after 2τ ns, and the OR3 output z changes after 3τ ns (Fig. 2.39b). Thus, the propagation time of a circuit depends on the component propagation times but also on the circuit structure itself. Two different circuits could implement the same switching function but with different propagation times.

Fig. 2.40 Two circuits that implement the same function z (a: two-level factored form; b: sum-of-products form with an 8-input OR gate)

Fig. 2.41 n-Bit comparator (n-bit inputs X and Y; 1-bit outputs G, L, and E)

Example 2.12 The two following expressions define the same switching function z:

z = (a + b)·(c + d)·e + (k + g)·(h + i)·j,
z = a·c·e + a·d·e + b·c·e + b·d·e + k·h·j + k·i·j + g·h·j + g·i·j.

The corresponding circuits are shown in Fig. 2.40a, b. The circuit of Fig. 2.40a has 7 gates and 16 gate inputs, while the circuit of Fig. 2.40b has 9 gates and 32 gate inputs. On the other hand, if all gates are assumed to have the same propagation time τ ns, then the circuit of Fig. 2.40a has a propagation time equal to 3τ ns while the circuit of Fig. 2.40b has a propagation time equal to 2τ ns. Thus, the circuit of Fig. 2.40a could be less expensive in terms of number of transistors but with a longer propagation time than the circuit of Fig. 2.40b. Depending on the system specification, the designer will have to choose between a faster but more expensive implementation and a slower but cheaper one (speed vs. cost balance).

A more realistic example is now presented. An n-bit comparator (Fig. 2.41) is a circuit with two n-bit inputs X = xn-1xn-2…x0 and Y = yn-1yn-2…y0 that represent two naturals, and three 1-bit outputs G (greater), L (lower), and E (equal). It works as follows: G = 1 if X > Y, otherwise G = 0; L = 1 if X < Y, otherwise L = 0; E = 1 if X = Y, otherwise E = 0. A step-by-step algorithm can be used. For that, the pairs of bits (xi, yi) are sequentially explored, starting from the most significant bits (xn-1, yn-1). Initially G = 0, L = 0, and E = 1. As long as xi = yi, the values of G, L, and E do not change. When for the first time xi ≠ yi, there are two possibilities: if xi > yi then G = 1, L = 0, and E = 0, and if xi < yi then G = 0, L = 1, and E = 0. From this step on, the values of G, L, and E do not change any more.


Table 2.18 Magnitude comparison (n = 8, X = 1011----, Y = 1010----)

X       Y       G  L  E
1       1       0  0  1
0       0       0  0  1
1       1       0  0  1
1       0       1  0  0
0 or 1  0 or 1  1  0  0
...     ...     1  0  0

Fig. 2.42 Comparator structure (a chain of n 1-bit comparators, from (xn-1, yn-1) down to (x0, y0), producing G and L; E = NOR(G, L))

Algorithm 2.2 Magnitude Comparison
G = 0; L = 0; E = 1;
for i in n-1 downto 0 loop
  if E = 1 and xi > yi then G = 1; L = 0; E = 0;
  elsif E = 1 and xi < yi then G = 0; L = 1; E = 0;
  end if;
end loop;

This method is correct because in binary the weight of bits xi and yi is 2^i, which is greater than 2^(i-1) + 2^(i-2) + … + 2^0 = 2^i - 1. An example of computation is given in Table 2.18 with n = 8, X = 1011---- and Y = 1010----. The corresponding circuit structure is shown in Fig. 2.42. Obviously E = 1 if G = 0 and L = 0, so that E = NOR(G, L). Every block (Fig. 2.43) executes the loop body of Algorithm 2.2 and is defined by the following Boolean expressions, where Ei = Gi'·Li':

Gi-1 = Ei·xi·yi' + Ei'·Gi = Gi'·Li'·xi·yi' + (Gi + Li)·Gi = Li'·xi·yi' + Gi,    (2.35)

Li-1 = Ei·xi'·yi + Ei'·Li = Gi'·Li'·xi'·yi + (Gi + Li)·Li = Gi'·xi'·yi + Li.    (2.36)
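Algorithm 2.2 with the block equations (2.35) and (2.36) can be simulated in a few lines (an illustrative sketch; complements are written as 1 − v):

```python
def compare(X, Y, n):
    """Bit-serial magnitude comparison (Algorithm 2.2): scan the bits of
    X and Y from most to least significant, updating G and L with the
    1-bit comparator equations of the text."""
    G, L = 0, 0
    for i in range(n - 1, -1, -1):
        x, y = (X >> i) & 1, (Y >> i) & 1
        # G_{i-1} = L'·x·y' + G ; L_{i-1} = G'·x'·y + L (evaluated together)
        G, L = ((1 - L) & x & (1 - y)) | G, ((1 - G) & (1 - x) & y) | L
    return G, L, 1 - (G | L)            # E = NOR(G, L)

assert compare(0b10110000, 0b10100000, 8) == (1, 0, 0)   # X > Y, as in Table 2.18
assert compare(5, 5, 4) == (0, 0, 1)                     # X = Y
```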

The circuit that implements (2.35) and (2.36) is shown in Fig. 2.44. It contains 8 gates (including the inverters) and 14 gate inputs, and its propagation time is 3τ ns, assuming as before that all components (NOT, AND3, and OR2) have the same delay τ ns. The complete n-bit comparator (Fig. 2.42) contains 8n + 1 gates and 14n + 2 gate inputs and has a propagation time equal to (3n + 1)τ ns. Instead of reading the bits of X and Y one at a time, consider an algorithm that reads two bits of X and Y at each step. Assume that n = 2m. Then the following Algorithm 2.3 is similar to Algorithm 2.2. The difference is that two successive bits x2j+1 and x2j of X and two successive bits y2j+1 and y2j of Y are considered. Those pairs of bits can be interpreted as quaternary digits (base-4 digits).


Fig. 2.43 1-Bit comparator (inputs xi, yi, Gi, Li; outputs Gi-1, Li-1)

Fig. 2.44 1-Bit comparator implementation

Fig. 2.45 Comparator structure (version 2): a chain of 2-bit comparators, from (xn-1 xn-2, yn-1 yn-2) down to (x1 x0, y1 y0), producing G, L, and E

Fig. 2.46 2-Bit comparator (inputs xi, xi-1, yi, yi-1, Gj, Lj; outputs Gj-1, Lj-1)

Algorithm 2.3 Magnitude Comparison, Version 2
G = 0; L = 0; E = 1;
for j in m-1 downto 0 loop
  if E = 1 and x2j+1x2j > y2j+1y2j then G = 1; L = 0; E = 0;
  elsif E = 1 and x2j+1x2j < y2j+1y2j then G = 0; L = 1; E = 0;
  end if;
end loop;

The corresponding circuit structure is shown in Fig. 2.45. Every block (Fig. 2.46) executes the loop body of Algorithm 2.3 and is defined by Table 2.19, to which correspond the following equations:

Gj-1 = Lj'·xi-1·yi'·yi-1' + Lj'·xi·yi' + Lj'·xi·xi-1·yi-1' + Gj,    (2.37)

Lj-1 = Gj'·yi-1·xi'·xi-1' + Gj'·yi·xi' + Gj'·yi·yi-1·xi-1' + Lj,    (2.38)

where i = 2j + 1.
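Equations (2.37) and (2.38) can be checked exhaustively against 2-bit integer comparison (a sketch; complements are written as 1 − v, and xh, xl stand for xi, xi-1):

```python
def block2(Gj, Lj, xh, xl, yh, yl):
    """One 2-bit comparator block, directly from (2.37) and (2.38)."""
    Gn = (((1 - Lj) & xl & (1 - yh) & (1 - yl)) | ((1 - Lj) & xh & (1 - yh))
          | ((1 - Lj) & xh & xl & (1 - yl)) | Gj)
    Ln = (((1 - Gj) & yl & (1 - xh) & (1 - xl)) | ((1 - Gj) & yh & (1 - xh))
          | ((1 - Gj) & yh & yl & (1 - xl)) | Lj)
    return Gn, Ln

# While undecided (Gj = Lj = 0) the block compares the 2-bit groups:
for xh in (0, 1):
    for xl in (0, 1):
        for yh in (0, 1):
            for yl in (0, 1):
                x, y = 2 * xh + xl, 2 * yh + yl
                assert block2(0, 0, xh, xl, yh, yl) == (int(x > y), int(x < y))
```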


Table 2.19 2-Bit comparator definition

Gj  Lj  xi  xi-1  yi  yi-1  Gj-1  Lj-1
0   0   0   0     0   0     0     0
0   0   0   0     1   -     0     1
0   0   0   0     -   1     0     1
0   0   0   1     0   0     1     0
0   0   0   1     0   1     0     0
0   0   0   1     1   -     0     1
0   0   1   0     0   -     1     0
0   0   1   0     1   0     0     0
0   0   1   0     1   1     0     1
0   0   1   1     0   -     1     0
0   0   1   1     -   0     1     0
0   0   1   1     1   1     0     0
0   1   -   -     -   -     0     1
1   0   -   -     -   -     1     0
1   1   -   -     -   -     -     -

Fig. 2.47 2-Bit comparator implementation

Table 2.20 Comparison between the circuits of Figs. 2.42 and 2.45

Circuit       Gates    Gate inputs  Propagation time
Figure 2.42   8n + 1   14n + 2      (3n + 1)τ
Figure 2.45   7n + 1   18n + 2      (1.5n + 1)τ

The circuit that implements (2.37) and (2.38) is shown in Fig. 2.47. It contains 14 gates (including the inverters) and 36 gate inputs, and its propagation time is 3τ ns, assuming as before that all components (NOT, AND3, AND4, and OR4) have the same delay τ ns. The complete n-bit comparator (Fig. 2.45), with n = 2m, contains 14m + 1 = 7n + 1 gates and 36m + 2 = 18n + 2 gate inputs and has a propagation time equal to (3m + 1)τ = (1.5n + 1)τ ns. To summarize (Table 2.20), the circuit of Fig. 2.45 has fewer gates, more gate inputs, and a shorter propagation time than the circuit of Fig. 2.42 (roughly half the propagation time).

2.7 Other Logic Blocks

Apart from the logic gates, some other components are available and can be used to implement combinational circuits.

2.7.1 Multiplexers

The circuit of Fig. 2.48a is a 1-bit 2-to-1 multiplexer (MUX2-1). It has two data inputs x0 and x1, a control input c, and a data output y. It works as follows (Fig. 2.48b): when c = 0 the data output y is connected to the data input x0, and when c = 1 the data output y is connected to the data input x1. So, the main function of a multiplexer is to implement controllable connections. A typical application is shown in Fig. 2.49: the input of circuit_C can be connected to the output of either circuit_A or circuit_B under the control of signal control:

• If control = 0, circuit_C input = circuit_A output.
• If control = 1, circuit_C input = circuit_B output.

More complex multiplexers can be defined. An m-bit 2^n-to-1 multiplexer has 2^n m-bit data inputs x0, x1, …, x(2^n - 1), an n-bit control input c, and an m-bit data output y. It works as follows: if c is equal to the binary representation of natural i, then y = xi. Two examples are given in Fig. 2.50: in Fig. 2.50a the symbol of an m-bit 2-to-1 multiplexer is shown, and in Fig. 2.50b the symbol and the truth table of a 1-bit 4-to-1 multiplexer (MUX4-1) are shown.

Fig. 2.48 1-Bit 2-to-1 multiplexer (a: symbol; b: truth table)

Fig. 2.49 Example of controllable connection (circuit_A and circuit_B feeding circuit_C through a MUX2-1 driven by control)

Fig. 2.50 Examples of multiplexers (a: m-bit 2-to-1 multiplexer; b: 1-bit 4-to-1 multiplexer and its truth table: y = x0, x1, x2, or x3 for c1c0 = 00, 01, 10, 11)

Fig. 2.51 2-Bit MUX4-1 implemented with six 1-bit MUX2-1 (a: symbol; b: implementation)

Fig. 2.52 MUX2-1 is a universal module (implementations of a·b, a + b, and NOT(a))

Fig. 2.53 Implementation of two 3-variable switching functions (a: truth table of y1 and y0; b: MUX2-1 tree implementation)

In fact, any multiplexer can be built with 1-bit 2-to-1 multiplexers. For example, Fig. 2.51a is the symbol of a 2-bit MUX4-1 and Fig. 2.51b is an implementation consisting of six 1-bit MUX2-1. Multiplexers can also be used to implement switching functions. The function executed by the 1-bit MUX2-1 of Fig. 2.48 is

y = c'·x0 + c·x1.    (2.39)

In particular, MUX2-1 is a universal module (Fig. 2.52):

• If c = a, x0 = 0, and x1 = b, then y = a·b.
• If c = a, x0 = b, and x1 = 1, then y = a'·b + a = a + b.
• If c = a, x0 = 1, and x1 = 0, then y = a'.

Furthermore, any switching function of n variables can be implemented by a 2^n-to-1 multiplexer. As an example, consider the 3-variable switching functions y1 and y0 of Fig. 2.53a. Each of them can be implemented by a MUX8-1 that in turn can be synthesized with seven MUX2-1 (Fig. 2.53b).
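The three universal-module configurations can be checked directly from (2.39); a minimal sketch:

```python
def mux2(c, x0, x1):
    # 1-bit 2-to-1 multiplexer: y = c'·x0 + c·x1 (2.39)
    return ((1 - c) & x0) | (c & x1)

for a in (0, 1):
    for b in (0, 1):
        assert mux2(a, 0, b) == a & b      # c = a, x0 = 0, x1 = b: AND
        assert mux2(a, b, 1) == a | b      # c = a, x0 = b, x1 = 1: OR
    assert mux2(a, 1, 0) == 1 - a          # c = a, x0 = 1, x1 = 0: NOT
```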

Fig. 2.54 Optimization rules (a: a multiplexer selecting between constants 1 and 0 under x reduces to x; b: two multiplexers with the same control and the same data inputs merge into one)

Fig. 2.55 Optimized circuits

The three variables x2, x1, and x0 control the connection of the output (y1 or y0) to a constant value, as defined in the function truth table. In many cases the circuit can be simplified using simple and obvious rules. Two optimization rules are shown in Fig. 2.54. In Fig. 2.54a, if x = 0 the multiplexer output is equal to 0 and if x = 1 it is equal to 1; thus the multiplexer output is equal to x. In Fig. 2.54b, two multiplexers controlled by the same variable x and with the same data inputs can be replaced by a single multiplexer. An optimized version of the circuits of Fig. 2.53b is shown in Fig. 2.55.

Two switching function synthesis methods using multiplexers have been described. The first uses multiplexers to implement the basic Boolean operations (AND, OR, NOT), which is generally not a good idea, rather a way to demonstrate that MUX2-1 is a universal module. The second uses an m-bit 2^n-to-1 multiplexer to implement m functions of n variables. In fact, an m-bit 2^n-to-1 multiplexer with all its data inputs connected to constant values implements the same function as a ROM storing 2^n m-bit words. The 2^n-to-1 multiplexers can then be synthesized with MUX2-1 and the circuits optimized using rules such as those of Fig. 2.54.

A more general switching function synthesis method with MUX2-1 components is based on (2.39) and on the fact that any n-variable switching function f(x0, x1, …, xn-1) can be expressed in the form

f(x0, x1, …, xn-1) = x0'·f0(x1, …, xn-1) + x0·f1(x1, …, xn-1),    (2.40)

where

f0(x1, …, xn-1) = f(0, x1, …, xn-1) and f1(x1, …, xn-1) = f(1, x1, …, xn-1)    (2.41)

are functions of n-1 variables. The circuit of Fig. 2.56 is a direct consequence of (2.40) and (2.41). In this way variable x0 has been extracted.


Fig. 2.56 Extraction of variable x0 (f0(x1, ···, xn-1) and f1(x1, ···, xn-1) feeding a MUX2-1 controlled by x0, producing f(x0, x1, ···, xn-1))

Fig. 2.57 MUX2-1 implementation of f (Example 2.13)

Then a similar variable extraction can be performed with functions f0 and f1 (not necessarily with the same variable) so that functions of n-2 variables are obtained, and so on. An iterative extraction of variables finally generates constants (0-variable functions), variables, or already generated functions.

Example 2.13 Use the variable extraction method to implement the following 4-variable function: f = x0'·x1'·x3 + x0'·x1·x2 + x0·x2' + x0·x3. First extract x0: f0 = x1'·x3 + x1·x2 and f1 = x2' + x3. Then extract x1 from f0: f00 = x3 and f01 = x2. Extract x2 from f1: f10 = 1 and f11 = x3. It remains to synthesize x3 = x3·1 + x3'·0. The circuit is shown in Fig. 2.57.
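The extraction tree of Example 2.13 can be checked exhaustively: rebuilding f from the MUX2-1 network of Fig. 2.57 must reproduce the original function (a sketch; mux2 is an assumed behavioral helper):

```python
from itertools import product

def mux2(c, a, b):
    # behavioral 1-bit 2-to-1 multiplexer: a if c = 0, b if c = 1
    return b if c else a

def f(x0, x1, x2, x3):
    # f = x0'·x1'·x3 + x0'·x1·x2 + x0·x2' + x0·x3
    return int((not x0 and not x1 and x3) or (not x0 and x1 and x2)
               or (x0 and not x2) or (x0 and x3))

def f_tree(x0, x1, x2, x3):
    f0 = mux2(x1, x3, x2)       # f00 = x3, f01 = x2
    f1 = mux2(x2, 1, x3)        # f10 = 1,  f11 = x3
    return mux2(x0, f0, f1)

for bits in product((0, 1), repeat=4):
    assert f(*bits) == f_tree(*bits)
```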

2.7.2 Multiplexers and Memory Blocks

ROM blocks can be used to implement switching functions defined by their truth table (Sect. 2.2), but in most cases it is a very inefficient method. However, the combined use of small ROM blocks, generally called LUTs (lookup tables), and of multiplexers makes it possible to define efficient circuits. This is a commonly used technique in field programmable devices such as FPGAs (Chap. 7). Assume that 6-input LUTs (LUT6) are available. Then the variable extraction method of Fig. 2.56 can be iteratively applied up to the step where all obtained functions depend on at most six variables. As an example, the circuit of Fig. 2.58 implements any function of eight variables. Observe that the rightmost part of the circuit of Fig. 2.58 synthesizes a function of six variables: x6, x7, and the four LUT6 outputs. An alternative circuit consisting of five LUT6 is shown in Fig. 2.59. Figure 2.59 suggests a variable extraction method in which two variables are extracted at each step. It uses the following relation:


Fig. 2.58 Implementation of an 8-variable switching function (four LUT6 blocks fed by x0–x5, followed by two levels of MUX2-1 controlled by x6 and x7)

Fig. 2.59 Alternative circuit (four LUT6 blocks fed by x0–x5, combined by a fifth LUT6 fed by x6 and x7)

Fig. 2.60 Extraction of variables x0 and x1 (a LUT6 combining x0, x1, f00, f01, f10, and f11)

f(x0, x1, …, xn-1) = x0'·x1'·f00(x2, …, xn-1) + x0'·x1·f01(x2, …, xn-1) + x0·x1'·f10(x2, …, xn-1) + x0·x1·f11(x2, …, xn-1),    (2.42)

where f00(x2, …, xn-1) = f(0, 0, x2, …, xn-1), f01(x2, …, xn-1) = f(0, 1, x2, …, xn-1), f10(x2, …, xn-1) = f(1, 0, x2, …, xn-1), and f11(x2, …, xn-1) = f(1, 1, x2, …, xn-1) are functions of n-2 variables. The corresponding variable extraction circuit (Fig. 2.60) is a LUT6 that implements the function of the six variables x0, x1, f00, f01, f10, and f11 equal to x0'·x1'·f00 + x0'·x1·f01 + x0·x1'·f10 + x0·x1·f11. Then a similar variable extraction can be performed with functions f00, f01, f10, and f11 so that functions of n-4 variables are obtained, and so on.
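Relation (2.42) amounts to computing the four cofactors of f with respect to x0 and x1; a brief sketch (function names are illustrative):

```python
from itertools import product

def cofactors2(f):
    """The four residual functions of (2.42): f00, f01, f10, f11."""
    return (lambda *xs: f(0, 0, *xs), lambda *xs: f(0, 1, *xs),
            lambda *xs: f(1, 0, *xs), lambda *xs: f(1, 1, *xs))

def recombine(x0, x1, f00, f01, f10, f11):
    # (2.42): f = x0'·x1'·f00 + x0'·x1·f01 + x0·x1'·f10 + x0·x1·f11
    return (((1 - x0) & (1 - x1) & f00) | ((1 - x0) & x1 & f01)
            | (x0 & (1 - x1) & f10) | (x0 & x1 & f11))

f = lambda a, b, c, d: (a ^ b) | (c & d)      # an arbitrary test function
f00, f01, f10, f11 = cofactors2(f)
for a, b, c, d in product((0, 1), repeat=4):
    assert f(a, b, c, d) == recombine(a, b, f00(c, d), f01(c, d),
                                      f10(c, d), f11(c, d))
```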


Fig. 2.61 AND plane and OR plane (a: (n, p) AND plane with inputs x0, x1, ···, xn-1 and outputs y0, ···, yp-1; b: (p, s) OR plane with inputs y0, ···, yp-1 and outputs z0, ···, zs-1)

Fig. 2.62 Switching function implementation with two planes (an AND plane feeding an OR plane)

Fig. 2.63 Address decoders (a: n-to-2^n address decoder; b: 2-to-4 address decoder and its truth table)

2.7.3 Planes

Sometimes AND planes and OR planes are used to implement switching functions. An (n, p) AND plane (Fig. 2.61a) implements p functions yj of n variables, where each yj is a product of literals (a variable or the complement of a variable):

yj = wj,0·wj,1·…·wj,n-1 where wj,i ∈ {1, xi, xi'}.

A (p, s) OR plane (Fig. 2.61b) implements s functions zj of p variables, where each zj is a Boolean sum of variables:

zj = wj,0 + wj,1 + … + wj,p-1 where wj,i ∈ {0, yi}.

Those planes can be configured when the corresponding integrated circuit (IC) is manufactured, or can be programmed by the user, in which case they are called field programmable devices. Any set of s switching functions that are expressed as Boolean sums of at most p products of at most n literals can be implemented by a circuit made up of an (n, p) AND plane and a (p, s) OR plane (Fig. 2.62): the AND plane generates p products of at most n literals and the OR plane generates s sums of at most p terms. Depending on the manufacturing technology and on the manufacturer, those AND-OR plane circuits receive different names, such as Programmable Array Logic (PAL), Programmable Logic Array (PLA), Programmable Logic Device (PLD), and others.
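A two-plane circuit can be modeled directly: each AND-plane row is a cube over the inputs and each OR-plane output lists the products it sums (an illustrative sketch, not a description of any particular device):

```python
def planes(and_rows, or_rows, x):
    """Evaluate an AND plane followed by an OR plane.

    and_rows: list of cubes over the inputs ('1' = literal, '0' =
    complemented literal, 'X' = literal absent from the product).
    or_rows: for each output, the indices of the products it sums."""
    products = [all(c == 'X' or int(c) == b for c, b in zip(row, x))
                for row in and_rows]
    return [int(any(products[i] for i in rows)) for rows in or_rows]

# z0 = x1·x0' + x1'·x0 (XOR), z1 = x1·x0, with x given as (x1, x0):
and_rows = ['10', '01', '11']
or_rows = [[0, 1], [2]]
assert planes(and_rows, or_rows, (1, 0)) == [1, 0]
assert planes(and_rows, or_rows, (1, 1)) == [0, 1]
```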

2.7.4 Address Decoder and Tristate Buffers

Another type of useful component is the address decoder. An n-to-2^n address decoder (Fig. 2.63a) has n inputs and 2^n outputs, and its function is defined as follows: if xn-1xn-2…x0 is the binary representation of natural i, then yi = 1 and all other outputs yj = 0.


Fig. 2.64 AND-OR plane implementation of a ROM (a: a 2-to-4 decoder AND plane feeding an OR plane that generates z2, z1, z0; b: the equivalent ROM contents)

Fig. 2.65 4-Bit MUX4-1 implemented with an address decoder and four tristate buffers

As an example, a 2-to-4 address decoder and its truth table are shown in Fig. 2.63b. In fact, an n-to-2^n address decoder implements the same function as an (n, 2^n) AND plane that generates all n-variable minterms:

mj = wj,0·wj,1·…·wj,n-1 where wj,i ∈ {xi, xi'}.

By connecting an n-to-2^n address decoder to a (2^n, s) OR plane, the obtained circuit implements the same function as a ROM storing 2^n s-bit words. An example is given in Fig. 2.64a: the AND plane synthesizes the functions of a 2-to-4 address decoder and the complete circuit implements the same functions as the ROM of Fig. 2.64b. The other common application of address decoders is the control of data buses. An example is given in Fig. 2.65: a 2-to-4 address decoder generates four signals that control four 4-bit tristate buffers. This circuit makes it possible to connect a 4-bit output z to one among four 4-bit inputs y0, y1, y2, or y3 under the control of two address bits x1 and x0. Actually, the circuit of Fig. 2.65 realizes the same function as a 4-bit MUX4-1. In Fig. 2.66a the circuit of Fig. 2.65 is used to connect one among four data sources (circuits A, B, C, and D) to a data destination (circuit E) under the control of two address bits. It executes the following algorithm:

case x1x0 is
  when 00 => circuit_E = circuit_A;
  when 01 => circuit_E = circuit_B;
  when 10 => circuit_E = circuit_C;
  when 11 => circuit_E = circuit_D;
end case;
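The decoder-plus-tristate-buffer bus of Fig. 2.65 behaves like a multiplexer; a small behavioral sketch (the high-impedance state is modeled as simply not driving the bus):

```python
def decoder(x, n):
    # n-to-2^n address decoder: output i is 1 iff x encodes i
    return [int(x == i) for i in range(2 ** n)]

def bus(sources, x):
    """Fig. 2.65 behaviour: the decoder enables exactly one tristate
    buffer, which alone drives the 4-bit bus."""
    sel = decoder(x, 2)
    assert sum(sel) == 1                 # one-hot: no bus contention
    return next(s for s, e in zip(sources, sel) if e)

assert decoder(2, 2) == [0, 0, 1, 0]
assert bus([0b0001, 0b0010, 0b0100, 0b1000], 3) == 0b1000
```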

The usual symbol of this bus is shown in Fig. 2.66b.


Fig. 2.66 A 4-bit data bus with four data sources (a: decoder and tristate-buffer implementation connecting circ.A–circ.D to circ.E; b: usual bus symbol)

2.8 Programming Language Structures

The specification of digital systems by means of algorithms (Sect. 1.2.1) is a central aspect of this course. In this section the relation between some programming language instructions and digital circuits is analyzed. This relation justifies the use of hardware description languages (HDL) similar to programming languages, as well as the generation of synthesis tools able to translate HDL descriptions to circuits.

2.8.1 If Then Else

A first example of an instruction that can be translated to a circuit is the conditional branch:

if a_condition then some_actions else other_actions;

As an example, consider the following binary decision algorithm. It computes the value of a switching function f of six variables x0, x1, y0, y1, y2, and y3.

Algorithm 2.4
if x1 = 0 then
  if x0 = 0 then f = y0; else f = y1; end if;
else
  if x0 = 0 then f = y2; else f = y3; end if;
end if;

This function can be implemented by the circuit of Fig. 2.67 in which the external conditional branch is implemented by the rightmost MUX2-1 and the two internal conditional branches are implemented by the leftmost MUX2-1s.
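Algorithm 2.4 and its MUX2-1 implementation compute the same values; a quick sketch of the structure of Fig. 2.67:

```python
def mux2(c, x0, x1):
    # behavioral 1-bit 2-to-1 multiplexer
    return x1 if c else x0

def f(x0, x1, y0, y1, y2, y3):
    # the rightmost MUX2-1 implements the outer if; the two leftmost
    # ones implement the inner ifs
    return mux2(x1, mux2(x0, y0, y1), mux2(x0, y2, y3))

assert f(0, 0, 'y0', 'y1', 'y2', 'y3') == 'y0'
assert f(1, 1, 'y0', 'y1', 'y2', 'y3') == 'y3'
```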


Fig. 2.67 Binary decision algorithm implementation (two MUX2-1 controlled by x0 selecting among y0–y3, followed by one MUX2-1 controlled by x1 producing f)

Fig. 2.68 Case instruction implementation (a MUX4-1 controlled by x selecting among y0–y3)

2.8.2 Case

A second example of an instruction that can be translated to a circuit is the conditional switch:

case variable_identifier is
  when variable_value1 => actions1;
  when variable_value2 => actions2;
end case;

As an example, the preceding binary decision algorithm (Algorithm 2.4) is equivalent to the following, assuming that x has been previously defined as a 2-bit vector (x0, x1).

Algorithm 2.5
case x is
  when 00 => f = y0;
  when 01 => f = y1;
  when 10 => f = y2;
  when 11 => f = y3;
end case;

Function f can be implemented by a MUX4-1 (Fig. 2.68).

2.8.3 Loops

For-loops are a third example of an easily translatable construct:

for variable_identifier in variable_range loop
  operations using the variable value
end loop;


Fig. 2.69 4-Digit decimal adder (a: 1-digit adder cell with inputs x, y, cyIN and outputs z, cyOUT; b: iterative circuit of four cells computing z4 z3 z2 z1 z0, with cyIN = 0 on the rightmost cell)

An iterative circuit can often be associated with this type of instruction. As an example, consider the following addition algorithm that computes z = x + y, where x and y are 4-digit decimal numbers, so that z is a 5-digit decimal number.

Algorithm 2.6 Addition of Two 4-Digit Naturals
cy0 = 0;
for i in 0 to 3 loop
  -- loop body:
  si = xi + yi + cyi;
  if si > 9 then zi = si - 10; cyi+1 = 1;
  else zi = si; cyi+1 = 0;
  end if;
  -- end of loop body
end loop;
z4 = cy4;
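Algorithm 2.6 translates directly into code; a sketch with digits given least significant first:

```python
def decimal_add(x_digits, y_digits):
    """4-digit decimal addition (Algorithm 2.6), digits LSD first.
    Each iteration is the loop body implemented by one cell of Fig. 2.69."""
    cy = 0
    z = []
    for xi, yi in zip(x_digits, y_digits):
        s = xi + yi + cy
        if s > 9:
            z.append(s - 10)
            cy = 1
        else:
            z.append(s)
            cy = 0
    z.append(cy)                # z4 = cy4
    return z

# 9999 + 0001 = 10000
assert decimal_add([9, 9, 9, 9], [1, 0, 0, 0]) == [0, 0, 0, 0, 1]
```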

The corresponding circuit is shown in Fig. 2.69b. It is an iterative circuit that consists of four identical blocks. Each of them is a 1-digit adder (Fig. 2.69a) that implements the loop body of Algorithm 2.6.

Comments 2.5
• Other (not combinational but sequential) loop implementation methods will be studied in Chap. 4.
• Not every loop can be implemented by means of an iterative combinational circuit. Consider a while-loop:

while a_condition loop
  operations
end loop;

The loop body is executed as long as some condition (which can be modified by the operations) is true. If the maximum number of times that the condition will be true is either unknown or too large, a sequential implementation (Chap. 4) must be considered.


Fig. 2.70 Implementation of procedure calls (a chain of eight MAC components: w(1) = 0, the i-th MAC computes w(i+1) from w(i), x(i), and y(i), and z = w(9))

2.8.4 Procedure Calls

Procedure (or function) calls constitute a fundamental aspect of well-structured programs and can be associated with hierarchical circuit descriptions. The following algorithm computes z = x1·y1 + x2·y2 + … + x8·y8. For that it makes several calls to a previously defined procedure multiply and accumulate (MAC), to which it passes four parameters a, b, c, and d. The procedure call MAC(a, b, c, d) executes d = a + b·c.

Algorithm 2.7 z = x1·y1 + x2·y2 + … + x8·y8
w(1) = 0;
for i in 1 to 8 loop
  MAC(w(i), x(i), y(i), w(i+1));
end loop;
z = w(9);

Thus w(2) = 0 + x1·y1 = x1·y1, w(3) = x1·y1 + x2·y2, w(4) = x1·y1 + x2·y2 + x3·y3, …, z = w(9) = x1·y1 + x2·y2 + x3·y3 + … + x8·y8. The corresponding circuit is shown in Fig. 2.70. Algorithm 2.7 is a for-loop to which corresponds an iterative circuit. The loop body is a procedure call to which corresponds a component MAC whose functional specification is d = a + b·c. This is an example of top-down hierarchical description: an iterative circuit structure (the top level) whose components are defined by their function and must afterwards be implemented (the lower level).
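The MAC chain of Fig. 2.70 can be sketched as follows (mac models the functional specification d = a + b·c):

```python
def mac(a, b, c):
    # functional specification of the MAC component: d = a + b·c
    return a + b * c

def dot8(x, y):
    """Algorithm 2.7: z = x1·y1 + x2·y2 + ... + x8·y8 as a chain of
    MAC calls, with w(1) = 0 and w(i+1) = MAC(w(i), x(i), y(i))."""
    w = 0
    for xi, yi in zip(x, y):
        w = mac(w, xi, yi)
    return w

assert dot8([1, 2, 3, 4, 5, 6, 7, 8], [1] * 8) == 36
```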


2.8.5 Conclusion

There are several programming language constructs that can easily be translated to circuits. This fact justifies the use of formal languages to specify digital circuits, either classical programming languages such as C/C++ or specific HDLs such as VHDL or Verilog. In this course VHDL will be used (Appendix A). The relation between programming language instructions and circuits also explains why it has been possible to develop software packages able to synthesize circuits starting from functional descriptions in some formal language.

2.9 Exercises

1. Synthesize with logic gates the function z of Table 2.3.
2. Generate Boolean expressions of functions f, g, and h of three variables x2, x1, and x0 defined by the following table:

x2 x1 x0   f   g   h
0  0  0    1   0   1
0  0  1    -   -   1
0  1  0    0   0   -
0  1  1    1   1   -
1  0  0    -   1   0
1  0  1    1   1   0
1  1  0    0   -   1
1  1  1    0   0   -

3. Simplify the following sets of cubes (n = 4):
{0000, 0010, 01X1, 0110, 1000, 1010},
{0001, 0011, 0100, 0101, 1100, 1110, 1011, 1010},
{0000, 0010, 1000, 1010, 0101, 1101, 1111}.
4. The following circuit consists of seven identical components with two inputs a and b, and two outputs c and d. The maximum propagation time from inputs a or b to outputs c or d is equal to 0.5 ns. What is the maximum propagation time from any input to any output (in ns)?

(Circuit diagram: seven identical two-input, two-output components, each with inputs a, b and outputs c, d, connected in a chain.)

References

5. Compute an upper bound Nmax and a lower bound Nmin of the number N of functions that can be implemented by the following circuit.

(Circuit diagram: two LUTs with inputs among x5, x4, x3, x2, x1, x0, whose outputs feed a third LUT producing f.)

6. Implement with MUX2-1 components the switching functions of three variables x2, x1, and x0 defined by the following sets of cubes: {11X, 101, 011}, {111, 100, 010, 001}, {1X1, 0X1}.
7. What set of cubes defines the function f(x5, x4, x3, x2, x1, x0) implemented by the following circuit?

(Circuit diagram with inputs x5, x4, x3, x2, x1, x0 and output f.)

8. Minimize the following Boolean expression: f(a, b, c, d) = a·b·c·d + a·b + a·b·c + a·b + a·c.
9. Implement the circuits of Figs. 2.14, 2.16, and 2.17 with NAND gates.
10. Implement (2.34) with NAND gates.

Burch C (2005) Logisim. http://www.cburch.com/logisim/es/index.html
Karnaugh M (1953) The map method for synthesis of combinational logic circuits. Trans Inst Electr Eng (AIEE) Part I 72(9):593–599

3 Arithmetic Blocks

Arithmetic circuits are an essential part of many digital circuits and thus deserve a particular treatment. In this chapter implementations of the basic arithmetic operations are presented. Only operations with naturals (nonnegative integers) are considered. A much more detailed and complete presentation of arithmetic circuits can be found in Parhami (2000), Ercegovac and Lang (2004), Deschamps et al. (2006), and Deschamps et al. (2012).

3.1 Binary Adder

Binary adders have already been described several times (e.g., Figs. 2.5 and 2.27). Given two n-bit naturals x and y and an incoming carry bit cy0, an n-bit adder computes an (n + 1)-bit number s = x + y + cy0, in which sn can be used as an outgoing carry bit cyn. The classical pencil and paper algorithm, adapted to the binary system, is the following:

Algorithm 3.1 Binary Adder: s = x + y + cy0
for i in 0 to n-1 loop
  si = xi xor yi xor cyi;
  cyi+1 = (xi and yi) or (xi and cyi) or (yi and cyi);
end loop;
sn = cyn;

At each step

si = (xi + yi + cyi) mod 2 = xi ⊕ yi ⊕ cyi,    (3.1)

and cyi+1 = 1 if, and only if, at least two bits among xi, yi, and cyi are equal to 1, a condition that can be expressed as follows:

cyi+1 = xi·yi + xi·cyi + yi·cyi.    (3.2)

The circuit that implements Algorithm 3.1 (a for-loop) is shown in Fig. 3.1. It consists of n identical blocks called full adders (FA) that implement the loop body of Algorithm 3.1 (3.1 and 3.2).
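The ripple-carry structure of Algorithm 3.1 and Fig. 3.1 can be sketched in Python. This is an illustrative model (the function name and bit-list convention are ours, not the book's):

```python
def binary_adder(x, y, cy0, n):
    """n-bit ripple-carry adder: returns the (n+1)-bit sum s = x + y + cy0
    as a list of bits [s0, ..., sn], least significant bit first."""
    s = []
    cy = cy0
    for i in range(n):                           # loop body = one full adder (FA)
        xi, yi = (x >> i) & 1, (y >> i) & 1
        s.append(xi ^ yi ^ cy)                   # Eq. (3.1)
        cy = (xi & yi) | (xi & cy) | (yi & cy)   # Eq. (3.2)
    s.append(cy)                                 # sn = cyn, the outgoing carry
    return s

# 7 + 9 + 0 with n = 4 gives s = 10000 (16), listed LSB first.
```

Each loop iteration plays the role of one FA block of Fig. 3.1, with the carry rippling from stage to stage.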

# Springer International Publishing Switzerland 2017 J.-P. Deschamps et al., Digital Systems, DOI 10.1007/978-3-319-41198-9_3

Fig. 3.1 n-Bit adder: a chain of n full adders (FA); stage i has inputs xi, yi, cyi and outputs si, cyi+1; cy0 is the incoming carry and sn = cyn.

3.2 Binary Subtractor

Given two n-bit naturals x and y and an incoming borrow bit b0, an n-bit subtractor computes d = x − y − b0. Thus d ≥ 0 − (2^n − 1) − 1 = −2^n and d ≤ (2^n − 1) − 0 − 0 = 2^n − 1, so that d is a signed integer belonging to the range

  −2^n ≤ d ≤ 2^n − 1.

The classical pencil and paper algorithm, adapted to the binary system, is used. At each step the difference xi − yi − bi is computed and expressed under the form

  xi − yi − bi = di − 2·bi+1, where di and bi+1 ∈ {0, 1}.   (3.3)

If xi − yi − bi < 0 then di = xi − yi − bi + 2 and bi+1 = 1; if xi − yi − bi ≥ 0 then di = xi − yi − bi and bi+1 = 0. At the end of step n the result is obtained under the form

  d = −dn·2^n + dn−1·2^(n−1) + dn−2·2^(n−2) + … + d0·2^0   (3.4)

where dn = bn is the last borrow bit. This type of representation (3.4), in which the most significant bit dn has a negative weight −2^n, is the 2's complement representation of the signed integer d. In this representation dn is the sign bit.

Example 3.1 Compute (n = 4) 0111 − 1001 − 1:

  xi:     0 1 1 1
  yi:     1 0 0 1
  bi:   1 0 0 1 1
  di:   1 1 1 0 1

Conclusion: 0111 − 1001 − 1 = 11101. In decimal: 7 − 9 − 1 = −16 + 13 = −3.

By reducing both members of (3.3) modulo 2 the following relation is obtained:

  di = xi ⊕ yi ⊕ bi.   (3.5)

On the other hand bi+1 = 1 if, and only if, xi − yi − bi < 0, that is, when xi = 0 and either yi or bi is equal to 1, or when both yi and bi are equal to 1. This condition can be expressed as follows:

  bi+1 = x̄i·yi + x̄i·bi + yi·bi.   (3.6)

The following algorithm computes d.


Fig. 3.2 n-Bit subtractor: a chain of n full subtractors (FS); stage i has inputs xi, yi, bi and outputs di, bi+1; b0 is the incoming borrow and dn = bn.

Algorithm 3.2 Binary Subtractor: d = x − y − b0

for i in 0 to n-1 loop
  di = xi xor yi xor bi;
  bi+1 = (not(xi) and yi) or (not(xi) and bi) or (yi and bi);
end loop;
dn = bn;

The circuit that implements Algorithm 3.2 (a for-loop) is shown in Fig. 3.2. It consists of n identical blocks called full subtractors (FS) that implement the loop body of Algorithm 3.2 (3.5 and 3.6).
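Algorithm 3.2 translates into a Python model in the same way as the adder. This sketch (names and bit-list convention are ours) returns the 2's complement bits of the difference:

```python
def binary_subtractor(x, y, b0, n):
    """n-bit ripple-borrow subtractor: returns the bits [d0, ..., dn] of
    d = x - y - b0 in 2's complement (dn is the sign bit), LSB first."""
    d = []
    b = b0
    for i in range(n):                                    # one full subtractor (FS)
        xi, yi = (x >> i) & 1, (y >> i) & 1
        d.append(xi ^ yi ^ b)                             # Eq. (3.5)
        b = ((1 - xi) & yi) | ((1 - xi) & b) | (yi & b)   # Eq. (3.6)
    d.append(b)                                           # dn = bn, the sign bit
    return d

# Example 3.1: 7 - 9 - 1 = -3, i.e. bits 11101 (LSB first: [1, 0, 1, 1, 1]).
```

The signed value is recovered as −dn·2^n plus the weighted sum of the remaining bits, exactly as in (3.4).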

3.3 Binary Adder/Subtractor

Given two n-bit naturals x and y and a 1-bit control input a/s, an n-bit adder/subtractor computes z = [x + y] mod 2^n if a/s = 0 and z = [x − y] mod 2^n if a/s = 1. To compute z, define ȳ as being the natural deduced from y by inverting all its bits: ȳ = ȳn−1 ȳn−2 … ȳ0, and check that

  ȳ = (1 − yn−1)·2^(n−1) + (1 − yn−2)·2^(n−2) + … + (1 − y0)·2^0 = 2^n − 1 − y.   (3.7)

Thus z can be computed as follows:

  z = [x + w + a/s] mod 2^n, where w = y if a/s = 0 and w = ȳ if a/s = 1.   (3.8)

In other words, wi = a/s ⊕ yi, ∀ i = 0 to n − 1.

Algorithm 3.3 Binary Adder/Subtractor

for i in 0 to n-1 loop
  wi = a/s xor yi;
end loop;
z = (x + w + a/s) mod 2^n;

The circuit that implements Algorithm 3.3 is shown in Fig. 3.3. It consists of an n-bit adder and n XOR2 gates. An additional XOR2 gate computes ovf (overflow): if a/s = 0, then ovf = 1 if, and only if, cyn = 1 and thus x + y ≥ 2^n; if a/s = 1, then ovf = 1 if, and only if, cyn = 0 and thus x + (2^n − 1 − y) + 1 < 2^n, that is to say, if x − y < 0.
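A minimal Python model of Algorithm 3.3 and the overflow gate (an illustrative sketch; the function name is ours):

```python
def add_sub(x, y, a_s, n):
    """n-bit adder/subtractor in the style of Fig. 3.3: returns (z, ovf).
    a_s = 0: z = (x + y) mod 2^n, ovf = 1 when x + y >= 2^n.
    a_s = 1: z = (x - y) mod 2^n, ovf = 1 when x - y < 0."""
    mask = (1 << n) - 1
    w = (y ^ (mask if a_s else 0)) & mask   # the n XOR2 gates: wi = a/s xor yi
    total = x + w + a_s                     # n-bit adder with carry-in = a/s
    z = total & mask
    cyn = (total >> n) & 1
    ovf = cyn ^ a_s                         # the additional XOR2 gate
    return z, ovf
```

Subtraction works because x − y = x + (2^n − 1 − y) + 1 modulo 2^n, which is exactly x + ȳ with a carry-in of 1, as established by (3.7) and (3.8).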

Fig. 3.3 n-Bit adder/subtractor: n XOR2 gates compute wi = yi ⊕ a/s; an n-bit adder with carry-in cy0 = a/s computes z = (x + w + a/s) mod 2^n; a final XOR2 gate on cyn and a/s produces ovf.

Fig. 3.4 Multiplication by yi (circuit and symbol): n AND2 gates compute the bits yi·xn−1, yi·xn−2, …, yi·x0 of the product yi·x (a: circuit; b: symbol).

3.4 Binary Multiplier

Given an n-bit natural x and an m-bit natural y, a multiplier computes p = x·y. The maximum value of p is (2^n − 1)·(2^m − 1) < 2^(n+m), so that p is an (n + m)-bit natural. If y = ym−1·2^(m−1) + ym−2·2^(m−2) + … + y1·2 + y0, then

  p = x·ym−1·2^(m−1) + x·ym−2·2^(m−2) + … + x·y1·2 + x·y0.   (3.9)

The preceding expression can be computed as follows. First compute a set of partial products p0 = x·y0, p1 = x·y1·2, p2 = x·y2·2^2, …, pm−1 = x·ym−1·2^(m−1), and then add the m partial products: p = p0 + p1 + p2 + … + pm−1. The computation of each partial product pi = x·yi·2^i is very easy. The product x·yi is computed by a set of AND2 gates (Fig. 3.4), and the multiplication by 2^i amounts to adding i 0s to the right of the binary representation of x·yi. For example, if i = 5 and x·y5 = 10010110 then x·y5·2^5 = 1001011000000. The computation of p = p0 + p1 + p2 + … + pm−1 can be executed by a sequence of 2-operand additions: p = (…(((0 + p0) + p1) + p2)…) + pm−1. The following algorithm computes p.

Fig. 3.5 Binary multiplier: a chain of m identical blocks; block i has inputs acc_in, a = yi, and b = x·2^i, and output acc_out = acc_in + b·a; the first acc_in is 0 and the last acc_out is p.

Algorithm 3.4 Binary Multiplier: p = x·y (Right to Left Algorithm)

acc0 = 0;
for i in 0 to m-1 loop
  acci+1 = acci + x·yi·2^i;
end loop;
p = accm;

Example 3.2 Compute (n = 5, m = 4) 11101 · 1011 (in decimal 29 · 11). The values of acci are the following:

  acc0: 0 0 0 0 0 0 0 0 0
  acc1: 0 0 0 0 1 1 1 0 1
  acc2: 0 0 1 0 1 0 1 1 1
  acc3: 0 0 1 0 1 0 1 1 1
  acc4: 1 0 0 1 1 1 1 1 1

Result: p = 100111111 (in decimal 319).

The circuit that implements Algorithm 3.4 (a for-loop) is shown in Fig. 3.5. It consists of m identical blocks that implement the loop body of Algorithm 3.4:

  acc_out = acc_in + b·a, where b = x·2^i and a = yi.

Comment 3.1 The building block of Fig. 3.5 computes acc_in + b·a. At step i input b = x·2^i is an (n + i)-bit number. In particular at step m − 1 it is an (n + m − 1)-bit number. Thus, if all blocks are identical,

Fig. 3.6 Optimized block: an n-bit adder plus n AND2 gates; x·yi is added to acc_in(i+n−1 ·· i) to produce acc_out(i+n ·· i), while acc_in(i−1 ·· 0) passes through unchanged as acc_out(i−1 ·· 0).

they must include an (n + m − 1)-bit adder and n + m − 1 AND2 gates. Nevertheless, for each i this building block can be optimized. At step number i it computes acc_out = acc_in + x·yi·2^i, where acc_in is an (i + n)-bit number (at each step one bit is added). On the other hand the rightmost i bits of x·yi·2^i are equal to 0. Thus

  acc_out(i+n ·· i) = acc_in(i+n−1 ·· i) + x·yi,
  acc_out(i−1 ·· 0) = acc_in(i−1 ·· 0).

The corresponding optimized block is shown in Fig. 3.6. Each block contains an n-bit adder and n AND2 gates.
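The shift-and-add scheme of Algorithm 3.4 can be sketched in Python (an illustrative model; left shifts stand in for the "append i zeros" step):

```python
def binary_multiplier(x, y, m):
    """Shift-and-add multiplier (Algorithm 3.4): p = x * y, computed as a
    running accumulator of the partial products x * yi * 2^i."""
    acc = 0
    for i in range(m):
        yi = (y >> i) & 1                    # bit yi selects the partial product
        acc = acc + ((x << i) if yi else 0)  # acc_{i+1} = acc_i + x*yi*2^i
    return acc                               # p = acc_m, an (n+m)-bit natural

# Example 3.2: 29 * 11 = 319 (binary 100111111).
```

Each iteration corresponds to one block of Fig. 3.5: the AND2 gates (yi gating x) followed by the adder.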

3.5 Binary Divider

Division is the most complex operation. Given two naturals x and y, their quotient q = x/y is usually not an integer; it is a so-called rational number. In many cases it is not even a fixed-point number, so the desired accuracy must be taken into account. The quotient q with an accuracy of p fractional bits is defined by the following relation:

  x/y = q + e, where q is a multiple of 2^−p and e < 2^−p.   (3.10)

In other words q is a fixed-point number with p fractional bits

  q = qm−1 qm−2 … q0 . q−1 q−2 … q−p   (3.11)

such that the error e = x/y − q is smaller than 2^−p. Most division algorithms work with naturals x and y such that x < y, so that

  q = 0 . q−1 q−2 … q−p.   (3.12)

Consider the following sequence of integer divisions by y, with x = r0 < y:

  2·r0 = q−1·y + r1, with r1 < y,
  2·r1 = q−2·y + r2, with r2 < y,
  …
  2·rp−2 = q−(p−1)·y + rp−1, with rp−1 < y,
  2·rp−1 = q−p·y + rp, with rp < y.   (3.13)


At each step q−i and ri are computed as a function of ri−1 and y, so that the following relation holds true:

  2·ri−1 = q−i·y + ri.   (3.14)

For that:

• Compute d = 2·ri−1 − y.
• If d < 0 then q−i = 0 and ri = 2·ri−1; else q−i = 1 and ri = d.

Property 3.1 x/y = 0.q−1 q−2 … q−(p−1) q−p + (rp/y)·2^−p, with (rp/y)·2^−p < 2^−p.

Proof Multiply the first equation of (3.13) by 2^(p−1), the second by 2^(p−2), and so on. Thus

  2^p·r0 = 2^(p−1)·q−1·y + 2^(p−1)·r1, with r1 < y,
  2^(p−1)·r1 = 2^(p−2)·q−2·y + 2^(p−2)·r2, with r2 < y,
  …
  2^2·rp−2 = 2·q−(p−1)·y + 2·rp−1, with rp−1 < y,
  2·rp−1 = q−p·y + rp, with rp < y.

Then add up the p equations:

  2^p·r0 = (2^(p−1)·q−1 + 2^(p−2)·q−2 + … + 2·q−(p−1) + q−p)·y + rp, with rp < y,

so that

  x = (2^−1·q−1 + 2^−2·q−2 + … + 2^−(p−1)·q−(p−1) + 2^−p·q−p)·y + rp·2^−p, with rp·2^−p < y·2^−p,

and

  x/y = 0.q−1 q−2 … q−(p−1) q−p + (rp/y)·2^−p, with (rp/y)·2^−p < 2^−p.

Example 3.3 Compute 21/35 with an accuracy of 6 bits:

  2·21 = 1·35 + 7
  2·7 = 0·35 + 14
  2·14 = 0·35 + 28
  2·28 = 1·35 + 21
  2·21 = 1·35 + 7
  2·7 = 0·35 + 14

Thus q = 0.100110. In decimal: q = 38/2^6. Error = 21/35 − 38/2^6 = 0.6 − 0.59375 = 0.00625 < 2^−6 = 0.015625. The following algorithm computes q with an accuracy of p fractional bits.

Fig. 3.7 Binary divider: a chain of p identical blocks; block i computes r+ = 2·ri−1 − y with a subtractor whose sign bit gives q−i and controls a multiplexer that selects ri = 2·ri−1 (q−i = 0) or ri = r+ (q−i = 1); the last remainder is rp.

Algorithm 3.5 Binary Divider: q ≅ x/y, Error < 2^−p (Restoring Algorithm)

r0 = x;
for i in 1 to p loop
  d = 2·ri-1 - y;
  if d < 0 then q-i = 0; ri = 2·ri-1;
  else q-i = 1; ri = d;
  end if;
end loop;

The circuit that implements Algorithm 3.5 (a for-loop) is shown in Fig. 3.7. It consists of p identical blocks that implement the loop body of Algorithm 3.5.
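Algorithm 3.5 maps directly onto Python (an illustrative sketch; the function name and return convention are ours):

```python
def restoring_divider(x, y, p):
    """Restoring division (Algorithm 3.5): returns the p fractional quotient
    bits [q_-1, ..., q_-p] of x/y (requires x < y) and the final remainder rp."""
    assert 0 <= x < y
    q, r = [], x
    for _ in range(p):
        d = 2 * r - y        # trial subtraction
        if d < 0:
            q.append(0)      # q_-i = 0: restore, keep 2*r
            r = 2 * r
        else:
            q.append(1)      # q_-i = 1: accept the subtraction
            r = d
    return q, r

# Example 3.3: 21/35 to 6 bits gives q = 0.100110 and remainder 14.
```

By Property 3.1, the bits returned satisfy x/y = 0.q−1…q−p + (rp/y)·2^−p, so the error is below 2^−p.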

3.6 Exercises

1. An integer x can be represented under the form (−1)^s·m, where s is the sign of x and m is its magnitude (absolute value). Design an n-bit sign-magnitude adder/subtractor.
2. An incrementer-decrementer is a circuit with two n-bit inputs x and m, one binary control input up/down, and one n-bit output z. If up/down = 0, it computes z = (x + 1) mod m, and if up/down = 1, it computes z = (x − 1) mod m. Design an n-bit incrementer-decrementer.
3. Consider the circuit of Fig. 3.5 with the optimized block of Fig. 3.6. The n-bit adder of Fig. 3.6 can be implemented with n 1-bit adders (full adders). Define a 1-bit multiplier as being a component


with four binary inputs a, b, c, d, and two binary outputs e and f, that computes a·b + c + d and expresses the result as 2·e + f (a 2-bit number). Design an n-bit-by-m-bit multiplier consisting of 1-bit multipliers.
4. Synthesize a 2n-bit-by-2n-bit multiplier using n-bit-by-n-bit multipliers and n-bit adders as components.
5. A mod m reducer is a circuit with two n-bit inputs x and m (m > 2) and one n-bit output z = x mod m. Synthesize a mod m reducer.

References

Deschamps JP, Bioul G, Sutter G (2006) Synthesis of arithmetic circuits. Wiley, New York
Deschamps JP, Sutter G, Cantó E (2012) Guide to FPGA implementation of arithmetic functions. Springer, Netherlands
Ercegovac M, Lang T (2004) Digital arithmetic. Morgan Kaufmann Publishers, San Francisco
Parhami B (2000) Computer arithmetic. Oxford University Press, Oxford

4 Sequential Circuits

The digital systems that have been defined and implemented in the preceding chapters are combinational circuits: if component delays are not taken into account, the value of their output signals depends only on the values of their input signals at the same time. However, many digital system specifications cannot be implemented by combinational circuits, because the value of an output signal may be a function not only of the input signal values at the same time, but also of the input signal values at preceding times.

4.1 Introductory Example

Consider the vehicle access control system of Fig. 4.1. It consists of

• A gate that can be raised and lowered by a motor
• A push button to request the access
• Two sensors that detect two particular gate positions (upper and lower)
• A sensor that detects the presence of a vehicle within the gate area

The motor control system has four binary input signals:

• Request, equal to 1 when there is an entrance request (push button)
• Lower, equal to 1 when the gate has been completely lowered
• Upper, equal to 1 when the gate has been completely raised
• Vehicle, equal to 1 if there is a vehicle within the gate area

The binary output signals on/off and up/down control the motor:

• To raise the gate: on/off = 1 and up/down = 1
• To lower the gate: on/off = 1 and up/down = 0
• To maintain the gate open or closed: on/off = 0

The motor control system cannot be implemented by a combinational circuit. As an example, if at some time request = 0, vehicle = 0, upper = 0, and lower = 0, this set of input signal values could

Fig. 4.1 Vehicle access control: a motor (ON/OFF, UP/DOWN) raises and lowers the gate; the motor control block receives request, vehicle, upper, and lower.

correspond to two different situations: (1) a vehicle is present in front of the gate, the request button has been pushed and released, and the gate is moving up; or (2) a vehicle has got in and the gate is moving down. In the first case on/off = 1 and up/down = 1; in the second case on/off = 1 and up/down = 0. In conclusion, the values of the signals that control the motor depend on the following sequence of events:

1. Wait for request = 1 (entrance request)
2. Raise the gate
3. Wait for upper = 1 (gate completely open)
4. Wait for vehicle = 0 (gate area cleared)
5. Lower the gate
6. Wait for lower = 1 (gate completely closed)

A new entrance request is not serviced until this sequence of events is completed. Conclusion: some type of memory is necessary in order to store the current step number (1–6) within the sequence of events.

4.2 Definition

Sequential circuits are digital systems with memory. They implement systems whose output signal values depend on the input signal values at times t (the current time), t − 1, t − 2, and so on (the precise meaning of t − 1, t − 2, etc. will be defined later). Two simple examples are sequence detectors and sequence generators.

Example 4.1 (Sequence Detector) Implement a circuit (Fig. 4.2a) with a decimal input x and a binary output y. It generates an output value y = 1 every time that the four latest inputted values were 1 5 5 7. It is described by the following instruction, in which t stands for the current time:

if x(t-3) = 1 AND x(t-2) = 5 AND x(t-1) = 5 AND x(t) = 7
then y = 1;
else y = 0;
end if;
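The behavior of Example 4.1 can be sketched in Python (an illustrative model; the three stored values play the role of the circuit's memory):

```python
def sequence_detector(inputs):
    """Emits y = 1 whenever the four latest input digits are 1, 5, 5, 7
    (Example 4.1). q0, q1, q2 model the stored values x(t-1), x(t-2), x(t-3)."""
    q0 = q1 = q2 = 0
    outputs = []
    for x in inputs:
        y = 1 if (q2, q1, q0, x) == (1, 5, 5, 7) else 0
        outputs.append(y)
        q2, q1, q0 = q1, q0, x   # shift: q2D = q1, q1D = q0, q0D = x
    return outputs

# input 2 1 5 5 7 5 7 produces y = 0 0 0 0 1 0 0
```

The shift assignment at the end of each iteration is exactly the next-state function (4.1) of the implementation discussed below.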

Fig. 4.2 Sequence detector (a) and sequence generator (b): the detector has input x and output y; the generator only has output y.

Fig. 4.3 Sequential circuit: a combinational circuit with external inputs x0 … xn−1, external outputs y0 … yk−1, state inputs q0 … qm−1, and next-state outputs q0Δ … qm−1Δ, plus a memory that stores the m-bit internal state.

Thus, the corresponding circuit must store x(t − 3), x(t − 2), and x(t − 1) and generate y as a function of the stored values and of the current value of x.

The corresponding circuit must store y(t 2) and y(t 1) and generates the current value of y in function of the stored values. Initially (t ¼ 0) the stored values y(2) and y(1) are equal to 1 so that the first output value of y is 0. The general structure of a sequential circuit is shown in Fig. 4.3. It consists of • A combinational circuit that implements k + m switching functions y0, y1, . . ., yk1, q0Δ, q1Δ, . . ., qm1Δ of n + m variables x0, x1, . . ., xn1, q0, q1, . . ., qm1 • A memory that stores an m-bit vector The combinational circuit inputs x0, x1, . . ., xn1 are inputs of the sequential circuit while (q0, q1, . . ., qm1) is an m-bit vector read from the memory. The combinational circuit outputs y0, y1, . . ., yk1 are outputs of the sequential circuit while (q0Δ, q1Δ, . . ., qm1Δ) is an m-bit vector written to the memory. The way the memory is implemented and the moments when the memory contents (q0, q1, . . ., qm1) are updated and replaced by (q0Δ, q1Δ, . . ., qm1Δ) will be defined later. With this structure, the output signals y0, y1, . . ., yk1 depend not only on the current value of the input signals x0, x1, . . ., xn1 but also on the memory contents q0, q1, . . ., qm1. The values of q0, q1, . . ., qm1 are updated at time . . . t1, t, t + 1, . . . with new values q0Δ, q1Δ, . . ., qm1Δ that are generated by the combinational circuit. The following terminology is commonly used: • x0, x1, . . ., xn1 are the external inputs • y0, y1, . . ., yk1 are the external outputs

Fig. 4.4 Sequence detector implementation: a combinational circuit with input x and output y, and a memory storing q0, q1, q2 (next values q0Δ, q1Δ, q2Δ).

• (q0, q1, …, qm−1) is the internal state
• (q0Δ, q1Δ, …, qm−1Δ) is the next state

To summarize,

• The memory stores the internal state.
• The combinational circuit computes the value of the external outputs and the next state in function of the external inputs and of the current internal state.
• The internal state is updated at every time unit … t − 1, t, t + 1, … by replacing q0 by q0Δ, q1 by q1Δ, and so on.

Example 4.3 (Sequence Detector Implementation) The sequence detector of Example 4.1 can be implemented by the sequential circuit of Fig. 4.4, in which x, q0, q1, q2, q0Δ, q1Δ, and q2Δ are 4-bit vectors that represent decimal digits. The memory must store the three previous values of x, that are q0 = x(t − 1), q1 = x(t − 2), and q2 = x(t − 3). For that

  q0Δ = x, q1Δ = q0, q2Δ = q1.   (4.1)

The output y is defined as follows:

  y = 1 if, and only if, q2 = 1 AND q1 = 5 AND q0 = 5 AND x = 7.   (4.2)

Equations 4.1 and 4.2 define the combinational circuit function.

In the previous definitions and examples the concept of current time t is used, but it has not been explicitly defined. To synchronize a sequential circuit, to give sense to the concept of current time, and, in particular, to define the moments when the internal state is updated, a clock signal must be generated. It is a square wave signal (Fig. 4.5) with period T. The positive edges of this clock signal define the times that have been called … t − 1, t, t + 1, … and are expressed in multiples of the clock signal period T. In particular the positive edges define the moments when the internal state is replaced by the next state. In Fig. 4.5 some commonly used terms are defined:

• Positive edge: a transition of the clock signal from 0 to 1
• Negative edge: a transition of the clock signal from 1 to 0
• Cycle: section of a clock signal that corresponds to one period
• Frequency: the number of cycles per second (1/T)
• Positive pulse: the part of a clock signal cycle where clock = 1
• Negative pulse: the part of a clock signal cycle where clock = 0

Fig. 4.5 Clock signal: a square wave of period T (frequency 1/T), with positive and negative edges, positive and negative pulses, and cycles.

Comment 4.1 Instead of using the positive edges of the clock signal to synchronize the circuit operations, the negative edges could be used. This is an essential part of the specification of a sequential circuit: positive edge triggered or negative edge triggered.

4.3 Explicit Functional Description

Explicit functional descriptions of combinational circuits (Sect. 1.2.1) are tables that define the output signal values associated with all possible combinations of input signal values. In the case of sequential circuits all possible internal states must also be considered: different internal states correspond to different relations between input and output signals.

4.3.1 State Transition Graph

A state transition graph consists of a set of vertices that correspond to the internal states and of a set of directed edges that define the internal state transitions and the output signal values in function of the input signal values.

Example 4.4 The graph of Fig. 4.6b defines a sequential circuit (Fig. 4.6a) that has three internal states A, B, and C, encoded with two binary variables q0 and q1 that are stored in the memory block; it has a binary input signal x and a binary output signal y. It works as follows:

• If the internal state is A and if x = 0 then the next state is C and y = 0.
• If the internal state is A and if x = 1 then the next state is A and y = 1.
• If the internal state is B then (whatever x) the next state is A and y = 0.
• If the internal state is C and if x = 0 then the next state is C and y = 0.
• If the internal state is C and if x = 1 then the next state is B and y = 1.

To complete the combinational circuit (Fig. 4.6a) specification it remains to choose the encoding of states A, B, and C, for example:

  A: q0q1 = 00, B: q0q1 = 01, C: q0q1 = 10.   (4.3)

The following case instruction defines the combinational circuit function:

Fig. 4.6 Example of state transition graph (Mealy model): states A, B, C; each edge is labelled with the input value that causes the transition and the corresponding output value (e.g., A → A on x = 1, y = 1; A → C on x = 0, y = 0; B → A whatever x, y = 0; C → C on x = 0, y = 0; C → B on x = 1, y = 1).

Fig. 4.7 Example of state transition graph (Moore model): states A (y = 1), B (y = 0), C (y = 1); edges are labelled with the input values only, and each vertex carries its output value.

case q0q1 is
  when 00 => if x = 0 then q0Δq1Δ = 10; y = 0;
             else q0Δq1Δ = 00; y = 1; end if;
  when 01 => q0Δq1Δ = 00; y = 0;
  when 10 => if x = 0 then q0Δq1Δ = 10; y = 0;
             else q0Δq1Δ = 01; y = 1; end if;
  when others => q0Δq1Δ = don't care; y = don't care;
end case;
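The Mealy machine of Example 4.4 can be simulated with a transition dictionary in Python (an illustrative sketch; the names are ours):

```python
# Transition function of Example 4.4: (state, x) -> (next_state, y).
MEALY = {
    ('A', 0): ('C', 0), ('A', 1): ('A', 1),
    ('B', 0): ('A', 0), ('B', 1): ('A', 0),
    ('C', 0): ('C', 0), ('C', 1): ('B', 1),
}

def run_mealy(inputs, state='A'):
    """Applies one input value per clock cycle and collects the outputs.
    In a Mealy model, y depends on both the state and the current input."""
    outputs = []
    for x in inputs:
        state, y = MEALY[(state, x)]
        outputs.append(y)
    return outputs

# inputs 1, 0, 1, 1 starting from A give outputs 1, 0, 1, 0
```

The dictionary is a direct transcription of the state transition graph: one entry per directed edge.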

The clock signal (Sect. 4.2) is not represented in Fig. 4.6a, but it is implicitly present and is responsible for the periodic updating of the internal state. The way the external output signal values are defined in Fig. 4.6b corresponds to the so-called Mealy model: the value of y depends on the current internal state and on the current value of the input signal x. In the following example another method is used.

Example 4.5 The graph of Fig. 4.7b defines a sequential circuit (Fig. 4.7a) that has three internal states A, B, and C, encoded with two binary variables q0 and q1 that are stored in the memory block; it has two binary input signals x0 and x1 that encode a ternary digit x ∈ {0, 1, 2} and a binary output signal y. It works as follows:

• If the internal state is A and if x = 0 then the next state is C and y = 1.
• If the internal state is A and if x = 1 then the next state is A and y = 1.
• If the internal state is A and if x = 2 then the next state is B and y = 1.
• If the internal state is B then (whatever x) the next state is A and y = 0.
• If the internal state is C and if x = 0 then the next state is C and y = 1.
• If the internal state is C and if x = 1 or 2 then the next state is B and y = 1.

To complete the combinational circuit (Fig. 4.7a) specification it remains to choose the encoding of states A, B, and C, for example the same as before (4.3). The following case instruction defines the combinational circuit function:

case q0q1 is
  when 00 => if x1x0 = 00 then q0Δq1Δ = 10;
             elsif x1x0 = 01 then q0Δq1Δ = 00;
             elsif x1x0 = 10 then q0Δq1Δ = 01;
             else q0Δq1Δ = don't care; end if;
             y = 1;
  when 01 => if x1x0 = 11 then q0Δq1Δ = don't care;
             else q0Δq1Δ = 00; end if;
             y = 0;
  when 10 => if x1x0 = 00 then q0Δq1Δ = 10;
             elsif (x1x0 = 01) or (x1x0 = 10) then q0Δq1Δ = 01;
             else q0Δq1Δ = don't care; end if;
             y = 1;
  when others => q0Δq1Δ = don't care; y = don't care;
end case;

In this case, the value of y only depends on the current internal state. It is the so-called Moore model: the value of y depends only on the current internal state; it does not depend on the current value of the input signals x0 and x1.

To summarize, two graphical description methods have been described. In both cases it is a graph whose vertices correspond to the internal states of the sequential circuit and whose directed edges are labelled with the input signal values that cause the transition from one state to another. They differ in the way that the external output signals are defined.

• In the first case (Example 4.4) the external output values are a function of the internal state and of the external input values. The directed edges are labelled with both the input signal values that cause the transition and the corresponding output signal values. It is the Mealy model.
• In the second case (Example 4.5) the external output values are a function of the internal state. The directed edges are labelled with the input signal values that cause the transition, and the vertices with the corresponding output signal values. It is the Moore model.

Observe that a Moore model is a particular case of Mealy model in which all edges whose origin is the same internal state are labelled with the same output signal values. As an example, the graph of Fig. 4.8 describes the same sequential circuit as the graph of Fig. 4.7b. Conversely, it can be demonstrated that a sequential circuit defined by a Mealy model can also be defined by a Moore model but, generally, with more internal states.

Fig. 4.8 Mealy model of Fig. 4.7b: the same graph in which every edge leaving a state is labelled with that state's output value.

Fig. 4.9 Photo of a robot vacuum cleaner (courtesy of iRobot Corporation)

4.3.2 Example of Explicit Description Generation

Given a functional specification of a sequential circuit, for example in a natural language, how can a state transition graph be defined? There is obviously no systematic and universal method to translate an informal specification to a state transition graph. It is mainly a matter of common sense and imagination. As an example, consider the circuit that controls a robot vacuum cleaner (the photo of a commercial robot is shown in Fig. 4.9). To make the example more tractable a simplified version of the robot is defined:

• The robot includes a sensor that generates a binary signal OB = 1 when it detects an obstacle in front of it.
• The robot can execute three orders under the control of two binary inputs LR (left rotate) and RR (right rotate): move forward (LR = RR = 0), turn 90° to the left (LR = 1, RR = 0), and turn 90° to the right (LR = 0, RR = 1).

The specification of the robot control circuit is the following:

• If there is no obstacle: move forward.
• When an obstacle is detected: turn to the right until there is no more obstacle.
• The next time an obstacle is detected: turn to the left until there is no more obstacle.
• The next time an obstacle is detected: turn to the right until there is no more obstacle, and so on.

This behavior cannot be implemented by a combinational circuit. In order to take a decision it is not enough to know whether there is an obstacle or not; it is necessary to know the latest ordered movements:

Fig. 4.10 Robot control circuit: (a) combinational circuit with input OB, outputs RR, RL, and state bits q0, q1 (next values q0Δ, q1Δ); (b) state transition graph (Moore model) with states SAR (RR RL = 00), SRR (RR RL = 10), SAL (RR RL = 00), and SRL (RR RL = 01), whose transitions are controlled by OB.

• If the previous command was turn to the right and if there is no obstacle then move forward.
• If the previous command was turn to the right and there is still an obstacle then keep turning to the right.
• If the previous command was turn to the left and if there is no obstacle then move forward.
• If the previous command was turn to the left and there is still an obstacle then keep turning to the left.
• If the previous command was move forward and if there is no obstacle then keep moving forward.
• If the previous command was move forward and there is an obstacle and the latest rotation was to the left then turn to the right.
• If the previous command was move forward and there is an obstacle and the latest rotation was to the right then turn to the left.

This analysis suggests the definition of four internal states:

• SAL: The robot is moving forward and the latest rotation was to the left.
• SAR: The robot is moving forward and the latest rotation was to the right.
• SRR: The robot is turning to the right.
• SRL: The robot is turning to the left.

With those internal states the behavior of the robot control circuit is defined by the state transition graph of Fig. 4.10b (Moore model). To define the combinational circuit of Fig. 4.10a the internal states of Fig. 4.10b must be encoded. For example:

  SAR: q0q1 = 00, SRR: q0q1 = 01, SAL: q0q1 = 10, SRL: q0q1 = 11.   (4.4)

The following case instruction defines the combinational circuit function:

case q0q1 is
  when 00 => if OB = 0 then q0Δq1Δ = 00; else q0Δq1Δ = 11; end if;
             RR = 0; RL = 0;
  when 01 => if OB = 0 then q0Δq1Δ = 00; else q0Δq1Δ = 01; end if;
             RR = 1; RL = 0;

Table 4.1 Robot control circuit: next state table

  Current state | Input: OB | Next state
  SAR           | 0         | SAR
  SAR           | 1         | SRL
  SRR           | 0         | SAR
  SRR           | 1         | SRR
  SAL           | 0         | SAL
  SAL           | 1         | SRR
  SRL           | 0         | SAL
  SRL           | 1         | SRL

Table 4.2 Robot control circuit: output table

  Current state | Outputs: RR RL
  SAR           | 0 0
  SRR           | 1 0
  SAL           | 0 0
  SRL           | 0 1

  when 10 => if OB = 0 then q0Δq1Δ = 10; else q0Δq1Δ = 01; end if;
             RR = 0; RL = 0;
  when 11 => if OB = 0 then q0Δq1Δ = 10; else q0Δq1Δ = 11; end if;
             RR = 0; RL = 1;
end case;

4.3.3 Next State Table and Output Table

Instead of defining the behavior of a sequential circuit with a state transition graph, another option is to use tables. Once the set of internal states is known, the specification of the circuit of Fig. 4.3 amounts to the specification of the combinational circuit, for example by means of two tables:

• A table (next state table) that defines the next internal state in function of the current state and of the external input values
• A table (output table) that defines the external output values in function of the current internal state (Moore model) or in function of the current internal state and of the external input values (Mealy model)

As an example, the state transition diagram of Fig. 4.10b can be described by Tables 4.1 and 4.2.
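The two tables translate directly into a table-driven model. A minimal Python sketch of the robot controller described by Tables 4.1 and 4.2 (an illustration; the names are ours):

```python
# Next state table (Table 4.1): (current_state, OB) -> next state.
NEXT_STATE = {
    ('SAR', 0): 'SAR', ('SAR', 1): 'SRL',
    ('SRR', 0): 'SAR', ('SRR', 1): 'SRR',
    ('SAL', 0): 'SAL', ('SAL', 1): 'SRR',
    ('SRL', 0): 'SAL', ('SRL', 1): 'SRL',
}
# Output table (Table 4.2, Moore model): state -> (RR, RL).
OUTPUT = {'SAR': (0, 0), 'SRR': (1, 0), 'SAL': (0, 0), 'SRL': (0, 1)}

def robot_control(ob_samples, state='SAR'):
    """One OB sample per clock cycle; returns the sequence of (RR, RL) outputs."""
    outputs = []
    for ob in ob_samples:
        outputs.append(OUTPUT[state])     # Moore: output depends on the state only
        state = NEXT_STATE[(state, ob)]   # table lookup = combinational circuit
    return outputs

# OB = 0, 1, 1, 0, 1 from SAR gives (0,0), (0,0), (0,1), (0,1), (0,0)
```

Each dictionary entry corresponds to one row of the tables, so checking the model against the specification is a row-by-row comparison.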

4.4 Bistable Components

Bistable components such as latches and flip-flops are basic building blocks of any sequential circuit. They are used to implement the memory block of Fig. 4.3 and to synchronize the circuit operations with an external clock signal (Sect. 4.2).

4.4.1 1-Bit Memory

A simple 1-bit memory is shown in Fig. 4.11a. It consists of two interconnected inverters. This circuit has two stable states. In Fig. 4.11b the first inverter input is equal to 0, so that the second inverter input is equal to 1 and its output is equal to 0; thus this is a stable state. Similarly, another stable state is shown in Fig. 4.11c. This circuit has the capacity to store a 1-bit data. It remains to define the way a particular stable state can be selected.

To control the state of the 1-bit memory, the circuit of Fig. 4.11a is completed with two tristate buffers controlled by an external Load signal (Fig. 4.12a):

• If Load = 1 then the circuit of Fig. 4.12a is equivalent to the circuit of Fig. 4.12b: the input D value (0 or 1) is transmitted to the first inverter input, so that the output P is equal to NOT(D) and Q = NOT(P) = D; on the other hand the output of the second inverter is disconnected from the first inverter input (buffer 2 in state Z, Sect. 2.4.3).
• If Load = 0 then the circuit of Fig. 4.12a is equivalent to the circuits of Fig. 4.12c and of Fig. 4.11a; thus it has two stable states; the value of Q is equal to the value of D just before the transition of signal Load from 1 to 0.

Observe that the two tristate buffers of Fig. 4.12a implement the same function as a 1-bit MUX2-1. The circuit of Fig. 4.12a is a D-type latch. It has two inputs: a data input D and a control input Load (sometimes called Enable). It has two outputs: Q and P = Q̄. Its symbol is shown in Fig. 4.13. Its working can be summarized as follows: when Load = 1 the value of D is sampled, and when Load = 0 this sampled value remains internally stored.
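The level-sensitive behavior of a D-type latch can be sketched as a small Python class (an illustrative model only; a real latch is an analog feedback circuit, not software):

```python
class DLatch:
    """Level-sensitive D-type latch: transparent while Load = 1,
    holds the last sampled value while Load = 0."""
    def __init__(self):
        self.q = 0
    def step(self, d, load):
        if load:           # transparent state: Q follows D
            self.q = d
        return self.q      # Load = 0: the stored value is kept

# While Load = 1, Q tracks every change of D; after Load falls to 0,
# Q keeps the value D had just before the falling edge.
```

The `if load` branch models the transparent state; the absence of an `else` branch models the feedback loop that holds the stored bit.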

Fig. 4.11 1-Bit memory: two interconnected inverters (a) and its two stable states (b, c).

Fig. 4.12 D-type latch: the 1-bit memory completed with two tristate buffers controlled by Load (a); equivalent circuits for Load = 1 (b) and Load = 0 (c).

Fig. 4.13 D-type latch symbol: inputs D and Load, outputs Q and Q̄.

4 Sequential Circuits

Formally, a D-type latch can be defined as a sequential circuit with an external input D, an external output Q (plus an additional output NOT(Q)), and two internal states S0 and S1. The next state table and the output table are shown in Table 4.3. However, this circuit is not synchronized by an external clock signal; it is a so-called asynchronous sequential circuit. In fact, the external input Load could be considered as a clock signal input: the value of D is read and stored on each 1-to-0 transition (falling edge) of the Load signal, so that the working of a D-type latch could be described by the equation QΔ = D. Nevertheless, when Load = 1 then Q = D (transparent state) and any change of D immediately causes the same change on Q, without any type of external synchronization.
Another way to control the internal state of the 1-bit memory of Fig. 4.11a is to replace the inverters by 2-input NOR (or NAND) gates. As an example, the circuit of Fig. 4.14a is an SR latch. It works as follows:
• If S = R = 0 then both NOR gates are equivalent to inverters (Fig. 4.14b) and the circuit of Fig. 4.14a is equivalent to a 1-bit memory (Fig. 4.11a).
• If S = 1 and R = 0 then the output of the first NOR is equal to 0, whatever the other input value, and the second NOR is equivalent to an inverter (Fig. 4.14c); thus Q = 1.
• If S = 0 and R = 1 then the output of the second NOR is equal to 0, whatever the other input value, and the first NOR is equivalent to an inverter (Fig. 4.14d); thus Q = 0.
To summarize, with S = 1 and R = 0 the latch is set to 1; with S = 0 and R = 1 the latch is reset to 0; with S = R = 0 the latch stores the latest written value. The combination S = R = 1 is not used (not allowed). The symbol of an SR latch is shown in Fig. 4.15.

Table 4.3 Next state table and output table of a D-type latch

Current state | Load | D | Next state | Output
S0            | 0    | – | S0         | 0
S0            | 1    | 0 | S0         | 0
S0            | 1    | 1 | S1         | 0
S1            | 0    | – | S1         | 1
S1            | 1    | 0 | S0         | 1
S1            | 1    | 1 | S1         | 1

Fig. 4.14 SR latch

Fig. 4.15 Symbol of an SR latch


An SR latch is an asynchronous sequential circuit. Its state can only change on a rising edge of either S or R, and the new state is defined by the following equation: QΔ = S + NOT(R)·Q.
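As a quick check of this equation, a small Python function (ours, for illustration) evaluates QΔ = S + NOT(R)·Q for the allowed input combinations:

```python
def sr_next(q, s, r):
    # next state of an SR latch: QΔ = S + NOT(R)·Q
    # (the combination S = R = 1 is not allowed)
    return s | ((1 - r) & q)

assert sr_next(0, 1, 0) == 1   # S = 1, R = 0: set to 1
assert sr_next(1, 0, 1) == 0   # S = 0, R = 1: reset to 0
assert sr_next(1, 0, 0) == 1   # S = R = 0: latest value kept
assert sr_next(0, 0, 0) == 0   # S = R = 0: latest value kept
```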

4.4.2 Latches and Flip-Flops

Consider again the sequential circuit of Fig. 4.3. The following question has not yet been answered: how is the memory block implemented? It has two functions: it stores the internal state and it synchronizes the operations by periodically updating the internal state under the control of a clock signal. In Sect. 4.4.1 a 1-bit memory component has been described, namely the D-type latch. A first option is shown in Fig. 4.16, where the memory block is made up of m D-type latches. The clock signal is used to periodically load new values within this m-bit memory. However, this circuit would generally not work correctly. The problem is that when clock = 1 all latches are in transparent mode, so that after a rising edge of clock the new values of q0, q1, ..., qm-1 could modify the values of q0Δ, q1Δ, ..., qm-1Δ before clock goes back to 0. To work correctly the clock pulses should be shorter than the minimum propagation time of the combinational circuit. For that reason another type of 1-bit memory element has been developed. A D-type flip-flop is a 1-bit memory element whose state can only change on a positive edge of its clock input. A possible implementation and its symbol are shown in Fig. 4.17. It consists of two D-type latches controlled by clock and NOT(clock), respectively, so that they are never in transparent

Fig. 4.16 Memory block implemented with latches

Fig. 4.17 D-type flip-flop

mode at the same time. When clock = 0 the first latch is in transparent mode, so that q1 = d, and the second latch stores the latest read value of q1. When clock = 1 the first latch stores the latest read value of d and the second latch is in transparent mode, so that q = q1. Thus, the state q of the second latch is updated on the positive edge of clock.
Example 4.6 Compare the circuits of Fig. 4.18a, c: with the same input signals Load and D (Fig. 4.18b, d) the output signals Q are different. In the first case (Fig. 4.18b), the latch transmits the value of D to Q as long as Load = 1. In the second case (Fig. 4.18d) the flip-flop transmits the value of D to Q only on the positive edges of Load.
Flip-flops need more transistors than latches; as an example, the flip-flop of Fig. 4.17 contains two latches. But circuits using flip-flops are much more reliable: the circuit of Fig. 4.19 works correctly even if the clock pulses are much longer than the combinational circuit propagation time; the only

Fig. 4.18 Latch vs. flip-flop

Fig. 4.19 Memory block implemented with flip-flops

Fig. 4.20 D-type flip-flop with asynchronous inputs set and reset


timing condition is that the clock period must be greater than the combinational circuit propagation time. For that reason flip-flops are the memory components that are used to implement the memory block of sequential circuits.
Comment 4.2 D-type flip-flops can be defined as synchronized sequential circuits whose equation is

QΔ = D.  (4.5a)

Other types of flip-flops have been developed: SR flip-flop, JK flip-flop, and T flip-flop. Their equations are

QΔ = S + NOT(R)·Q,  (4.5b)

QΔ = J·NOT(Q) + NOT(K)·Q,  (4.5c)

QΔ = T·NOT(Q) + NOT(T)·Q.  (4.5d)

Flip-flops are synchronous sequential circuits. Thus, (4.5a–4.5d) define the new internal state QΔ that will substitute the current value of Q on an active edge (positive or negative depending on the flip-flop type) of clock. Inputs D, S, R, J, K, and T are sometimes called synchronous inputs because their values are only taken into account on active edges of clock. Some components also have asynchronous inputs. The symbol of a D-type flip-flop with asynchronous inputs set and reset is shown in Fig. 4.20a. As long as set = reset = 0, it works as a synchronous circuit, so that its state Q only changes on an active edge of clock according to (4.5a). However, if at some moment set = 1 then, independently of the values of clock and D, Q is immediately set to 1, and if at some moment reset = 1 then, independently of the values of clock and D, Q is immediately reset to 0. An example of chronogram is shown in Fig. 4.20b. Observe that the asynchronous inputs have an immediate effect on Q and have priority with respect to clock and D.
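The four characteristic equations can be written directly as next-state functions; the sketch below (our helper names) encodes each one and checks a few well-known cases:

```python
# Characteristic equations (4.5a)-(4.5d) as Python functions
# (helper names are ours); q and all inputs are 0/1 integers.
def d_ff(q, d):
    return d                                  # QΔ = D

def sr_ff(q, s, r):
    return s | ((1 - r) & q)                  # QΔ = S + NOT(R)·Q

def jk_ff(q, j, k):
    return (j & (1 - q)) | ((1 - k) & q)      # QΔ = J·NOT(Q) + NOT(K)·Q

def t_ff(q, t):
    return q ^ t                              # QΔ = T·NOT(Q) + NOT(T)·Q = T XOR Q

assert jk_ff(0, 1, 0) == 1    # J = 1, K = 0: set
assert jk_ff(1, 1, 1) == 0    # J = K = 1: toggle
assert t_ff(1, 1) == 0        # T = 1: toggle
assert t_ff(1, 0) == 1        # T = 0: hold
```

In hardware these equations take effect only on the active clock edge; the functions above compute the value that would be loaded on that edge.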

4.5 Synthesis Method

All the concepts necessary to synthesize a sequential circuit have been studied in the preceding sections. The starting point is a state transition graph or equivalent next state table and output table. Consider again the robot control system of Sect. 4.3.2. It has four internal states SAR, SRR, SRL, and SAL and it is described by the state transition graph of Fig. 4.10b or by a next state table (Table 4.1) and an output table (Table 4.2). The output signal values are defined according to the Moore model.

Fig. 4.21 Robot control circuit with D-type flip-flops


Table 4.4 Next state functions q1Δ and q0Δ

Current state q1 q0 | Input: OB | Next state q1Δ q0Δ
00                  | 0         | 00
00                  | 1         | 11
01                  | 0         | 00
01                  | 1         | 01
10                  | 0         | 10
10                  | 1         | 01
11                  | 0         | 10
11                  | 1         | 11

Table 4.5 External output functions RR and RL

Current state q1 q0 | Outputs: RR RL
00                  | 00
01                  | 10
10                  | 00
11                  | 01

The general circuit structure is shown in Fig. 4.3. In this example there is an external input OB (n = 1) and two external outputs RR and RL (k = 2), and the four internal states can be encoded with two variables q0 and q1 (m = 2). Thus, the circuit of Fig. 4.10a is obtained. To complete the design, a first operation is to choose an encoding of the four states SAR, SRR, SRL, and SAL with two binary variables q0 and q1; use for example the encoding of (4.4). Another decision to be taken is the structure of the memory block. This point has already been analyzed in Sect. 4.4.2; the conclusion was that the more reliable option is to use flip-flops, for example D-type flip-flops (Fig. 4.17). The circuit that implements the robot control circuit is shown in Fig. 4.21 (Fig. 4.19 with n = 1, k = 2, and m = 2). To complete the sequential circuit design it remains to define the functions implemented by the combinational circuit. From Tables 4.1 and 4.2, from the chosen internal state encoding (4.4), and from the D-type flip-flop specification (4.5a), Tables 4.4 and 4.5 that define the combinational circuit are deduced. The implementation of a combinational circuit defined by truth tables has been studied in Chap. 2. The equations that correspond to Tables 4.4 and 4.5 are the following:

Fig. 4.22 Robot control circuit implemented with logic gates and D-type flip-flops


D1 = q1Δ = NOT(q1)·NOT(q0)·OB + q1·NOT(OB) + q1·q0,
D0 = q0Δ = OB,
RR = NOT(q1)·q0,
RL = q1·q0.

The corresponding circuit is shown in Fig. 4.22.
Comments 4.3
• Flip-flops generally have two outputs, Q and NOT(Q), so that the internal state variables qi are available both in normal and in inverted form, and no additional inverters are necessary.
• An external asynchronous reset has been added. It defines the initial internal state (q1 q0 = 00, that is, state SAR). In many applications it is necessary to set the circuit to a known initial state. Furthermore, to test the working of a sequential circuit it is essential to know its initial state.
As a second example consider the state transition graph of Fig. 4.23, in this case a Mealy model. Its next state and output tables are shown in Table 4.6. The three internal states can be encoded with two internal state variables q1 and q0, for example

S0: q1 q0 = 00, S1: q1 q0 = 01, S2: q1 q0 = 10.  (4.6)

The circuit structure is shown in Fig. 4.24. According to encoding (4.6) and Table 4.6, the combinational circuit is defined by Table 4.7. The equations that correspond to Table 4.7 are the following:

q1Δ = q0 + q1·NOT(a), q0Δ = NOT(q1)·NOT(q0)·a, z = q1·a.
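These equations can be checked against the state transition graph by simulation. The sketch below (our code) starts in S0 (q1 q0 = 00) and applies a = 1 three times, which should traverse S0 → S1 → S2 → S0 and raise z only on the last step:

```python
def mealy_step(q1, q0, a):
    # next state and output computed from the equations above
    nq1 = q0 | (q1 & (1 - a))          # q1Δ = q0 + q1·NOT(a)
    nq0 = (1 - q1) & (1 - q0) & a      # q0Δ = NOT(q1)·NOT(q0)·a
    z = q1 & a                         # Mealy output: depends on state AND input
    return nq1, nq0, z

q1, q0 = 0, 0                          # encoding (4.6): S0 = 00
outputs = []
for a in (1, 1, 1):
    q1, q0, z = mealy_step(q1, q0, a)
    outputs.append(z)

assert outputs == [0, 0, 1]            # z = 1 only on the S2 -> S0 transition
assert (q1, q0) == (0, 0)              # back in S0
```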

Fig. 4.23 Sequential circuit defined by a state transition graph (Mealy model)


Table 4.6 Next state and output tables (Mealy model)

Current state | a | Next state | z
S0            | 0 | S0         | 0
S0            | 1 | S1         | 0
S1            | 0 | S2         | 0
S1            | 1 | S2         | 0
S2            | 0 | S2         | 0
S2            | 1 | S0         | 1

Fig. 4.24 Circuit structure

Table 4.7 Combinational circuit: truth table

Current state q1 q0 | Input: a | Next state q1Δ q0Δ | Output: z
00                  | 0        | 00                 | 0
00                  | 1        | 01                 | 0
01                  | 0        | 10                 | 0
01                  | 1        | 10                 | 0
10                  | 0        | 10                 | 0
10                  | 1        | 00                 | 1
11                  | 0        | – –                | –
11                  | 1        | – –                | –

4.6 Sequential Components

This section deals with particular sequential circuits that are building blocks of larger circuits, namely registers, counters, and memory blocks.

4.6.1 Registers

An n-bit register is a set of n D-type flip-flops or latches controlled by the same clock signal. It is used to store an n-bit data. A register made up of n D-type flip-flops with asynchronous reset is shown in Fig. 4.25a and the corresponding symbol in Fig. 4.25b. This parallel register is a sequential circuit with 2^n states encoded by an n-bit vector q = qn-1 qn-2 ... q0 and defined by the following equations:

qΔ = IN, OUT = q.  (4.7)

Some registers have an additional OE (Output Enable) asynchronous control input (Fig. 4.26). This makes it possible to connect the register output to a bus without additional tristate buffers (they are included in the register). In this example the control input is active when it is equal to 0 (active-low input, Fig. 2.29); for that reason it is called NOT(OE) instead of OE. Figures 4.25 and 4.26 are two examples of parallel registers. Other registers can be defined, for example a set of n latches controlled by a load signal, instead of flip-flops, in which case OUT = IN (transparent state) when load = 1. Thus they should not be used within feedback loops to avoid

Fig. 4.25 n-Bit register


Fig. 4.26 n-Bit register with output enable (OE)


Fig. 4.27 Shift register

Fig. 4.28 Division and multiplication by 2


unstable states. Other examples of optional configurations are: clock active-low or active-high, asynchronous set or reset, and OE active-low or active-high.
Shift registers are another type of commonly used sequential components. Like parallel registers, they consist of a set of n D-type flip-flops controlled by the same clock signal, so that they store an n-bit vector; but furthermore they can shift the stored data by one position to the right (or to the left) at each clock pulse. An example of shift register is shown in Fig. 4.27. It has a serial input serial_in and a parallel output OUT. At each clock pulse a new bit is inputted, the stored word is shifted by one position to the right, and the last (least significant) bit is lost. This shift register is a sequential circuit with 2^n states encoded by an n-bit vector q = qn-1 qn-2 ... q0 and defined by the following equations:

qn-1Δ = serial_in, qiΔ = qi+1 for all i = 0 to n-2, OUT = q.  (4.8)

Shift registers have several applications. For example, assume that the current state q represents an n-bit natural. Then a shift to the right with serial_in = 0 (Fig. 4.28a) amounts to the integer division of q by 2, and a shift to the left with serial_in = 0 (Fig. 4.28b) amounts to the multiplication of q by 2 mod 2^n. For example, if n = 8 and the current state q is 10010111 (151 in decimal), then after a shift to the right q = 01001011 (75 = ⌊151/2⌋ in decimal) and after a shift to the left q = 00101110 (46 = 302 mod 256 in decimal). There are several types of shift registers that can be classified according to (among others):
• The shift direction: shift to the left, shift to the right, bidirectional shift, cyclic to the left, cyclic to the right
• The input type: serial or parallel input
• The output type: serial or parallel output
In Fig. 4.29 a 4-bit bidirectional shift register with serial input and parallel output is shown. When L/R = 0 the stored word is shifted to the left and when L/R = 1 the stored word is shifted to the right.
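The division/multiplication example above can be replayed with integer shift operators, which model an 8-bit register with serial_in = 0:

```python
# The worked example above, redone with Python shifts (8-bit register).
n = 8
q = 0b10010111                  # 151 in decimal
right = q >> 1                  # shift right, serial_in = 0: floor(q / 2)
left = (q << 1) % (1 << n)      # shift left, serial_in = 0: (2*q) mod 2^n

assert right == 0b01001011 == 75
assert left == 0b00101110 == 46
```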

Fig. 4.29 4-Bit bidirectional shift register with serial input and parallel output

Fig. 4.30 4-Bit shift register with serial and parallel input and with parallel output

This shift register is a sequential circuit with 16 states encoded by a 4-bit vector q = q3 q2 q1 q0 and defined by the following equations:

q3Δ = NOT(L/R)·q2 + (L/R)·IN,
qiΔ = NOT(L/R)·qi-1 + (L/R)·qi+1 for i = 1 and 2,
q0Δ = NOT(L/R)·IN + (L/R)·q1,
OUT = q.  (4.9)

Another example is shown in Fig. 4.30: a 4-bit shift register, with serial input IN3, parallel input IN = (IN3, IN2, IN1, IN0), and parallel output OUT. When L/S = 0 the parallel input value is loaded within the register, and when L/S = 1 the stored word is shifted to the right and IN3 is stored within the most significant register bit. This shift register is a sequential circuit with 16 states encoded by a 4-bit vector q = q3 q2 q1 q0 and defined by the following equations:

q3Δ = IN3,
qiΔ = NOT(L/S)·INi + (L/S)·qi+1 for i = 0, 1, 2,
OUT = q.  (4.10)

Shift registers with other control inputs can be defined. For example:
• PL (parallel load): when active, input bits INn-1, INn-2, ..., IN0 are immediately loaded in parallel, independently of the clock signal (asynchronous load).
• CE (clock enable): when active the clock signal is enabled; when inactive the clock signal is disabled and, in particular, there is no shift.
• OE (output enable): when equal to 0 all output buffers are enabled, and when equal to 1 all output buffers are in high impedance (state Z, disconnected).

Fig. 4.31 Symbol of a shift register with control inputs PL or L/S, CE, and OE

Fig. 4.32 Parallel-to-serial and serial-to-parallel conversion

Figure 4.31 is the symbol of a shift register with parallel input in, serial output serial_out, parallel output out, an active-low output enable control signal OE (to enable the parallel output), a clock enable control signal (CE), and an asynchronous reset input. With regard to the load and shift operations two options are considered: (1) with PL (asynchronous load), a new data is immediately stored when PL = 1, and the stored data is synchronously shifted to the right on a clock pulse when CE = 1; (2) with L/S (synchronous load), a new data is stored on a clock pulse when L/S = 0 and CE = 1, and the stored data is synchronously shifted to the right on a clock pulse when L/S = 1 and CE = 1.
Apart from arithmetic operations (multiplication and division by 2), shift registers are used in other types of applications. One of them is parallel-to-serial and serial-to-parallel conversion in data transmission systems. Assume that a system called “origin” must send n-bit data to another system called “destination” using a 1-bit transmission channel (Fig. 4.32). The solution is a parallel-in serial-out shift register on the origin side and a serial-in parallel-out shift register on the destination side. To transmit a data, it is first loaded within register 1 (parallel input); then it is serially shifted out of register 1 (serial output), transmitted on the 1-bit transmission channel, and shifted into register 2 (serial input); when all n bits have been transmitted, the transmitted data is read from register 2 (parallel output).
Another application of shift registers is the recognition of sequences. Consider a sequential circuit with a 1-bit input in and a 1-bit output out. It receives a continuous string of bits and must generate an output out = 1 every time that the six latest received bits in(t) in(t-1) in(t-2) in(t-3) in(t-4) in(t-5) are 100101. A solution is shown in Fig. 4.33: a serial-in parallel-out shift register stores the five values in(t-1) in(t-2) in(t-3) in(t-4) in(t-5) and generates out = 1 when in(t) in(t-1) in(t-2) in(t-3) in(t-4) in(t-5) = 100101. Another example is shown in Fig. 4.34. This circuit has an octal input in = in2 in1 in0 and a 1-bit output out. It receives a continuous string of digits and must generate an output out = 1 every time
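The Fig. 4.33 detector can be sketched in software (our code, names illustrative): a 5-bit shift register holds in(t-1)..in(t-5), and out = 1 whenever these bits, together with the new bit in(t), equal 100101:

```python
# Software sketch of the Fig. 4.33 sequence detector (our code).
def detector_outputs(stream):
    reg = [0, 0, 0, 0, 0]               # in(t-1) ... in(t-5), reset to 0
    outs = []
    for b in stream:
        window = [b] + reg              # in(t) in(t-1) ... in(t-5)
        outs.append(1 if window == [1, 0, 0, 1, 0, 1] else 0)
        reg = [b] + reg[:-1]            # clock pulse: shift by one position
    return outs

# bits arrive oldest first, so in(t-5)..in(t) = 1 0 1 0 0 1 matches 100101
assert detector_outputs([1, 0, 1, 0, 0, 1]) == [0, 0, 0, 0, 0, 1]
```

Note that the bits are listed newest-first in the pattern (in(t) is the leftmost bit of 100101), which is why the matching arrival order is 1 0 1 0 0 1.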


Fig. 4.33 Detection of sequence 100101

Fig. 4.34 Detection of sequence 1026


that the four latest received digits in(t) in(t-1) in(t-2) in(t-3) are 1026 (in binary 001 000 010 110). The circuit of Fig. 4.34 consists of three 3-bit shift registers that detect the 1-bit sequences in2 = 0001, in1 = 0011, and in0 = 1000, respectively. An AND3 output gate generates out = 1 every time that the three sequences are detected simultaneously.

4.6.2 Counters

Counters constitute another family of commonly used sequential components. An m-state counter (or mod m counter) is a Moore sequential circuit without external input and with an n-bit output q that is also its internal state and represents a natural belonging to the set {0, 1, ..., m-1}. At each clock pulse the internal state is increased or decreased. Thus the next state equation is

qΔ = (q + 1) mod m (up counter) or qΔ = (q - 1) mod m (down counter).  (4.11)

Thus counters generate cyclic sequences of states. In the case of a mod m up counter the generated sequence is ... 0 1 ... m-2 m-1 0 1 ... .
Definitions 4.1
• An n-bit binary up counter has m = 2^n states encoded according to the binary numeration system. If n = 3 it generates the following sequence: 000 001 010 011 100 101 110 111 000 001 ... .
• An n-bit binary down counter has m = 2^n states encoded according to the binary numeration system. If n = 3 it generates the following sequence: 000 111 110 101 100 011 010 001 000 111 ... .
• A binary coded decimal (BCD) up counter has ten states encoded according to the binary numeration system (BCD code). It generates the following sequence: 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 0000 0001 ... . A BCD down counter is defined in a similar way.

Fig. 4.35 n-Bit up counter


Fig. 4.36 n-Bit half adder


• An n-bit Gray counter has m = 2^n states encoded in such a way that two successive states differ in only one position (one bit). For example, with n = 3, a Gray counter sequence is 000 010 110 100 101 111 011 001 000 010 ... .
• Bidirectional counters have a control input U/D (up/down) that defines the counting direction (up or down).
The general structure of a counter is a direct consequence of its definition. If m is an n-bit number, then an m-state up counter consists of an n-bit register that stores the internal state q and of a modulo m adder that computes (q + 1) mod m. In Fig. 4.35a a 1-operand adder (also called half adder) with a carry input cyIN is used. The carry input can be used as an enable (EN) control input:

qΔ = EN·[(q + 1) mod m] + NOT(EN)·q.  (4.12)

The corresponding symbol is shown in Fig. 4.35b. If m = 2^n then the mod m adder of Fig. 4.35a can be implemented by the circuit of Fig. 4.36, which consists of n 1-bit half adders. Each of them computes

qiΔ = qi XOR cyi, cyi+1 = qi·cyi.  (4.13)

Observe that cyOUT could be used to enable another counter so as to generate a 2n-bit counter (2^2n = m^2 states) with two n-bit counters.
Example 4.7 According to (4.13) with cy0 = cyIN = 1, the equations of a 3-bit up counter are

q0Δ = q0 XOR 1 = NOT(q0), q1Δ = q1 XOR q0, q2Δ = q2 XOR (q1·q0),

to which corresponds the circuit of Fig. 4.37. Apart from reset and EN (Fig. 4.35), other control inputs can be defined, for example OE (output enable) as in the case of parallel registers (Fig. 4.26). An additional state output TC (terminal count) can also be defined: it is equal to 1 if, and only if, the current state q = m - 1. This signal is used to interconnect counters in series. If m = 2^n then (Figs. 4.36 and 4.37) TC = cyOUT.
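Iterating the Example 4.7 equations from state 000 should reproduce the full binary counting cycle; the sketch below (our code) checks this:

```python
# The Example 4.7 equations, iterated from state 000 (our sketch).
def counter_next(q2, q1, q0):
    # returns (q2Δ, q1Δ, q0Δ) per the XOR equations above
    return q2 ^ (q1 & q0), q1 ^ q0, 1 - q0

state = (0, 0, 0)
sequence = [state]
for _ in range(8):
    state = counter_next(*state)
    sequence.append(state)

assert sequence[1] == (0, 0, 1)   # 000 -> 001
assert sequence[7] == (1, 1, 1)   # ... -> 111
assert sequence[8] == (0, 0, 0)   # wraps around: mod 8 behavior
```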

Fig. 4.37 3-Bit up counter


Fig. 4.38 3-Bit up counter with active-low OE and with TC


Fig. 4.39 Bidirectional n-bit counter


Example 4.8 In Fig. 4.38 an active-low OE control input and a TC output are added to the counter of Fig. 4.37. TC = 1 when q = 7 (q2 = q1 = q0 = 1).
To implement a bidirectional (up/down) counter, the adder of Fig. 4.35 is replaced by an adder-subtractor (Fig. 4.39a). A U/D (up/down) control input permits choosing between addition (U/D = 0) and subtraction (U/D = 1). Input b_cIN is an incoming carry or borrow that can be used to enable the counter. Thus

Fig. 4.40 State transition graph of a bidirectional 3-bit counter



Fig. 4.41 Counter with parallel load

qΔ = EN·NOT(U/D)·[(q + 1) mod m] + EN·(U/D)·[(q - 1) mod m] + NOT(EN)·q.  (4.14)

The corresponding symbol is shown in Fig. 4.39b. As an example, the state transition graph of Fig. 4.40 defines a 3-bit bidirectional counter without EN control input (b_cIN = 1). In some applications it is necessary to define counters whose internal state can be loaded from an external input; examples of applications are programmable timers and microprocessor program counters. An example of programmable counter with parallel load is shown in Fig. 4.41a. An n-bit MUX2-1 permits writing into the state register either qΔ or an external input in. If load = 0 it works as an up counter, and when load = 1 the next internal state is in. Thus

qΔ = EN·NOT(load)·[(q + 1) mod m] + EN·load·in + NOT(EN)·q.  (4.15)

The corresponding symbol is shown in Fig. 4.41b.
Comment 4.3 Control inputs reset and load permit changing the normal counter sequence:
• If the counter is made up of flip-flops with set and reset inputs, an external reset command can change the internal state to any value (depending on the connection of the external reset signal to


individual flip-flop set or reset inputs), but always the same value, and this operation is asynchronous.
• The load command permits changing the internal state to any value defined by the external input in, and this operation is synchronous.
In Fig. 4.42 a counter with both asynchronous and synchronous reset is shown: the reset control input sets the internal state to 0 in an asynchronous way, while the synch_reset input sets the internal state to 0 in a synchronous way.
Some typical applications of counters are now described. A first application is the implementation of timers. Consider an example: synthesize a circuit with a 1-bit output z that generates a positive pulse on z every 5 s, assuming that a 1 kHz oscillator is available. The circuit is shown in Fig. 4.43. It consists of the oscillator and of a mod 5000 counter with state output TC (terminal count). The oscillator period is equal to 1 ms; thus the mod 5000 counter generates a TC pulse every 5000 ms, that is, every 5 s.
A second application is the implementation of systems that count events. As an example, a circuit that counts the number of 1s in a binary sequence is shown in Fig. 4.44a. It is assumed that the binary sequence is synchronized by a clock signal. This circuit is an up counter controlled by the same clock signal. The binary sequence is inputted to the EN control input, and the counter output gives the number of 1s within the input sequence. Every time a 1 is inputted, the counter is enabled and one unit is added to number. An example is shown in Fig. 4.44b.
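The event counter of Fig. 4.44a can be sketched as follows (our code; the 4-bit width is an assumption, any width large enough for the sequence works):

```python
# Sketch of the Fig. 4.44a event counter (our code): an up counter whose
# EN input is the binary sequence itself.
def count_ones(sequence, n=4):
    number, trace = 0, []
    for bit in sequence:
        if bit:                           # EN = 1: count on this clock pulse
            number = (number + 1) % (1 << n)
        trace.append(number)
    return trace

# the example sequence of Fig. 4.44b
assert count_ones([0, 1, 1, 0, 1, 1, 1, 0]) == [0, 1, 2, 2, 3, 4, 5, 5]
```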

Fig. 4.42 Asynchronous and synchronous reset


Fig. 4.43 Timer


Fig. 4.44 Number of 1’s counter



The 1-bit counter of Fig. 4.45a is a frequency divider. On each positive edge of in, connected to the clock input, the current value of out = Q is replaced by its inverse NOT(Q) (Fig. 4.45b). Thus, out is a square wave whose frequency is half the frequency of the input in.
A last example of application is the implementation of circuits that generate predefined sequences. For example, to implement a circuit that repeatedly generates the sequence 10010101, a 3-bit mod 8 counter and a combinational circuit that computes a 3-variable switching function out1 are used (Fig. 4.46a). Function out1 (Table 4.8) associates a bit of the desired output sequence to each counter state. Another example is given in Fig. 4.46b and Table 4.8. This circuit repeatedly generates the sequence 100101. It consists of a mod 6 counter and a combinational circuit that computes a 3-variable switching function out2 (Table 4.8). The mod 6 counter of Fig. 4.46b can be synthesized as shown in Fig. 4.35a with m = 6 and cyIN = 1. The combinational circuit that computes (q + 1) mod 6 is defined by the truth table of Table 4.9 and can be synthesized using the methods proposed in Chap. 2.

Fig. 4.45 Frequency divider by 2
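The Fig. 4.46a generator can be sketched by pairing a mod 8 counter with a lookup of the out1 function of Table 4.8 (our code):

```python
# Sketch of the Fig. 4.46a sequence generator (our code): a mod 8 counter
# addresses the out1 truth table of Table 4.8, producing 10010101 repeatedly.
OUT1 = {0: 1, 1: 0, 2: 0, 3: 1, 4: 0, 5: 1, 6: 0, 7: 1}   # Table 4.8

def generate(cycles):
    q, bits = 0, []
    for _ in range(cycles):
        bits.append(OUT1[q])    # combinational circuit 1
        q = (q + 1) % 8         # mod 8 up counter
    return bits

assert generate(8) == [1, 0, 0, 1, 0, 1, 0, 1]   # the sequence 10010101
assert generate(16) == generate(8) * 2           # it repeats every 8 pulses
```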


Fig. 4.46 Sequence generators


Table 4.8 Truth tables of out1 and out2

q   | out1 | out2
000 | 1    | 1
001 | 0    | 0
010 | 0    | 0
011 | 1    | 1
100 | 0    | 0
101 | 1    | 1
110 | 0    | –
111 | 1    | –

Table 4.9 Mod 6 addition

q   | qΔ
000 | 001
001 | 010
010 | 011
011 | 100
100 | 101
101 | 000
110 | –
111 | –

Fig. 4.47 Memory structure


4.6.3 Memories

Memories are essential components of any digital system. They have the capacity to store large amounts of data. Functionally they are equivalent to a set of registers that can be accessed individually, either to write a new data or to read a previously stored data.

4.6.3.1 Types of Memories
A generic memory structure is shown in Fig. 4.47. It is an array of small cells, each of which stores one bit. This array is logically organized as a set of rows, where each row stores a word. In the example of Fig. 4.47 there are four words and each of them has six bits. The selection of a particular word, either to read or to write, is done by the address inputs. In this example, to select a word among four, two bits a1 and a0 connected to an address bus are used. An address decoder generates the row selection signals (the word lines). For example, if a1 a0 = 10 (2 in decimal) then word number 2 is selected. On the other hand, the bidirectional (input/output) data lines are connected to the bit lines. Thus, if word number 2 is selected by the address inputs, then d5 is connected to bit number 5 of word number 2, d4 is connected to bit number 4 of word number 2, and so on. A control input R/W (read/write) defines the operation, for example write if R/W = 0 and read if R/W = 1.
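The read/write behavior of the Fig. 4.47 array can be modeled behaviorally; the following sketch (our code; class and parameter names are illustrative) uses four 6-bit words and the R/W convention just stated:

```python
# Behavioral sketch of the Fig. 4.47 memory (our code): four 6-bit words,
# an address selecting one word, and an R/W input (0 = write, 1 = read).
class SmallRAM:
    def __init__(self, words=4, bits=6):
        self.cells = [0] * words          # one entry per word line
        self.mask = (1 << bits) - 1       # keep each word within `bits` bits

    def access(self, address, rw, data=None):
        if rw == 0:                       # write cycle
            self.cells[address] = data & self.mask
            return None
        return self.cells[address]        # read cycle

mem = SmallRAM()
mem.access(2, rw=0, data=0b101101)        # write word number 2
assert mem.access(2, rw=1) == 0b101101    # read it back
assert mem.access(0, rw=1) == 0           # other words are untouched
```

Real memories add timing, chip-select, and tristate bus behavior that this functional model ignores.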


Fig. 4.48 Types of memories

Fig. 4.49 Commercial memory types

A list of the main types of memories is given in Fig. 4.48. A first classification criterion is volatility: volatile memories lose their contents when the power supply is turned off, while nonvolatile memories do not. Within nonvolatile memories there are read-only memories (ROM) and read/write memories. ROMs are programmed either at manufacturing time (mask-programmable ROM) or by the user, but only once (OTP = one-time programmable ROM). Other nonvolatile memories can be programmed several times by the user: erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), and flash memories (a block-oriented EEPROM). Volatile memories can be read and written. They are called random access memories (RAM) because the access time to a particular stored data item does not depend on its location (as a matter of fact, a ROM has the same characteristic). There are two families: static RAM (SRAM) and dynamic RAM (DRAM). The diagram of Fig. 4.49 shows different types of commercial memories classified according to their storage permanence (the maximum time without loss of information) and their ability to be programmed. Some memories can be programmed within the system to which they are connected (for example a printed circuit board); others must be programmed outside the system using a device called a memory programmer. Observe also that EPROM, EEPROM, and flash memories can be reprogrammed a large number of times (thousands) but not an infinite number of times. With regard to the time necessary to write a data item, SRAM and DRAM are much faster than nonvolatile memories. Nonvolatile RAM (NVRAM) is a battery-powered SRAM.


An ideal memory should have storage permanence equal to its lifetime, with the ability to be loaded as many times as necessary during its lifetime. The extreme cases are, on the one hand, mask-programmable ROMs, which have the largest storage permanence but no reprogramming possibility, and, on the other hand, static and dynamic RAMs, which have full programming ability.

4.6.3.2 Random Access Memories

RAMs are volatile memories. They lose their contents when the power supply is turned off. Their structure is the general one of Fig. 4.47. Under the control of the R/W control input, read and write operations can be executed: if the address bus a = i and R/W = 1 then the data stored in word number i is transmitted to the data bus d; if the address bus a = i and R/W = 0 then the current value of bus d is stored in word number i. Consider a RAM with n address bits that stores m-bit words (in Fig. 4.47 n = 2 and m = 6). Its behavior is defined by the following piece of program:

i = conv(a);
if R/W = 1 then d = word(i);
else word(i) = d;
end if;

in which conv is a conversion function that translates an n-bit vector (an address) to a natural belonging to the interval 0 to 2^n − 1 (this is the function of the address decoder of Fig. 4.47) and word is a vector of 2^n m-bit vectors (an array). A typical SRAM cell is shown in Fig. 4.50. It consists of a D-type latch, some additional control gates, and a tristate buffer. The word line is an output of the address decoder and the bit line is connected to the data bus through the read/write circuitry (Fig. 4.47). When the word line is equal to 1 and R/W = 0, the load input is equal to 1, the tristate output buffer is in high impedance (state Z, disconnected), and the value of the bit line connected to D is stored within the latch. When the word line is equal to 1 and R/W = 1, the load input is equal to 0 and the value of Q is transmitted to the bit line through the tristate buffer. Modern SRAM chips have a capacity of up to 64 megabits. Their read time is between 10 and 100 nanoseconds, depending on their size. Their power consumption is smaller than that of DRAM chips. An example of a very simple dynamic RAM (DRAM) cell is shown in Fig. 4.51a. It is made up of a very small capacitor and a transistor used as a switch. When the word line is selected, the cell capacitor is connected to the bit line through the transistor. In the case of a write operation, the bit line is connected to an external data input and the electrical charge stored in the cell capacitor is proportional to the input logic level (0 or 1). This electrical charge constitutes the stored information. However, this information must be periodically refreshed because otherwise it would be

Fig. 4.50 SRAM cell (a D-type latch with load input and a tristate buffer; the word line and R/W generate the load and output-enable signals; data enters and leaves through the bit line)

Fig. 4.51 DRAM cell (a. one-transistor cell connected to word line and bit line; b. bit-line pre-charge and comparison circuitry; c. read/write circuitry with refresh, connected to the data bus)

quickly lost due to leakage currents. In the case of a read operation, the bit line is connected to a data output. The problem is that the cell capacitor is much smaller than the equivalent capacitor of the bit line, so that when the cell capacitor is connected to the bit line, the stored electrical charge practically disappears. Thus, the read operation is destructive. Some additional electronic circuitry is used to sense the very small voltage variations on the bit line when a read operation is executed: before the connection of the bit line to a cell capacitor, it is pre-charged to an intermediate value (between levels 0 and 1); then an analog comparator is used to sense the very small voltage variation on the bit line in order to decide whether the stored information was 0 or 1 (Fig. 4.51b). Once the stored information has been read (and thus destroyed), it is rewritten into the original memory location. The data bus interface of Fig. 4.51c includes the analog comparators (one per bit line) as well as the logic circuitry in charge of rewriting the read data. To refresh the memory contents, all memory locations are periodically read (and thus rewritten). Modern DRAM chips have a capacity of up to 2 gigabits, a much larger capacity than that of SRAM chips. On the other hand, they are slower than SRAM and have higher power consumption.

4.6.3.3 Read-Only Memories

Within ROMs a distinction must be made between mask-programmable ROMs, whose contents are programmed at manufacturing time, and programmable ROMs (PROM), which can be programmed by the user, but only once. Other names are one-time programmable (OTP) or write-once memories. Their logic structure (Fig. 4.52) is also an array of cells. Each cell may connect, or not, a word line to a bit line. In the case of a mask-programmable ROM, programming consists in drawing some of the word-line-to-bit-line connections in the mask that corresponds to one of the metal levels (Sect. 7.1). In the case of a user-programmable ROM, either all connections are initially enabled and some of them can be disabled by the user (fuse technologies), or none of them is initially enabled and some of them can be enabled by the user (anti-fuse technologies).

4.6.3.4 Reprogrammable ROM

Reprogrammable ROMs are user-programmable ROMs whose contents can be reprogrammed several times. Their logic structure is the same as that of non-reprogrammable ROMs but the word-line-to-bit-line connections are floating-gate transistors instead of metal connections. There are three types of reprogrammable ROM:

• EPROM: Their contents are erased by exposing the chip to ultraviolet (UV) radiation; for that, the chip must be removed from the system (for example the printed circuit board) in which it is used; the chip package must have a window to let the UV light reach the floating-gate transistors; an external programmer is used to (re)program the memory.


Fig. 4.52 Read-only memory structure (an address bus a1 a0 feeds an address decoder that drives the four word lines; read circuitry, controlled by R, connects the bit lines d5 .. d0 to the data bus)

Fig. 4.53 Implementation of a 1 kB memory: a. the 1,024 × 8 memory to be implemented (R/W and ME control inputs, OE, 10-bit address A, 8-bit data D); b. the available 256 × 4 memory chip (8-bit address A, 4-bit data D)

• EEPROM: Their contents are selectively erased, one word at a time, using a specific higher voltage; the chip need not be removed from the system; the (re)programming circuitry is included within the chip.
• Flash memories: This is an EEPROM-type memory with better performance; in particular, block operations are performed instead of one-word-at-a-time operations; they are used in many applications, for example pen drives, memory cards, solid-state drives, and many others.

4.6.3.5 Example of a Memory Bank

Memory banks implement large memories with memory chips whose capacity is smaller than the desired capacity. As an example, consider the implementation of the memory of Fig. 4.53a, with a capacity of 1 kB (1024 words, 8 bits per word), using the memory chip of Fig. 4.53b, which stores 256 4-bit words. Thus, eight memory chips must be used (1024 × 8 = (256 × 4) × 8). Let a9 a8 a7 . . . a0 be the address bits of the memory to be implemented (Fig. 4.53a). The 1024-word address space is decomposed into four blocks of 256 words, using bits a9 and a8. Each block of 256 words is implemented with two chips working in parallel (Fig. 4.54). To select one of the four blocks the OE (output enable) control inputs are used:

Fig. 4.54 Address space (a9 a8 = 00, 01, 10, and 11 select the four 256-word blocks; within each block the address bits a7 .. a0 run from 00000000 to 11111111; block 0 is stored in chips 0 and 4, block 1 in chips 1 and 5, block 2 in chips 2 and 6, and block 3 in chips 3 and 7; chips 0 to 3 store bits 7 .. 4 and chips 4 to 7 store bits 3 .. 0)

Fig. 4.55 Memory bank (eight 256 × 4 chips sharing R/W, ME, and the address bits a7 .. a0; a decoder on a9 a8 generates OE0 .. OE7; chips 0 to 3 drive d7 d6 d5 d4 and chips 4 to 7 drive d3 d2 d1 d0)

OE0 = OE4 = a9′·a8′,  OE1 = OE5 = a9′·a8,  OE2 = OE6 = a9·a8′,  OE3 = OE7 = a9·a8.   (4.16)

The complete memory bank is shown in Fig. 4.55. A 2-to-4 address decoder generates the eight output enable functions (4.16). More information about memories can be found in classical books such as Weste and Harris (2010) or Rabaey et al. (2003).
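The output-enable functions (4.16) amount to a 2-to-4 decode of a9 a8, each decoder output enabling a pair of chips. A quick Python check (illustrative only, not part of the book):

```python
# Chip-select decode for the memory bank of Fig. 4.55.
# Bits a9 a8 select one of four 256-word blocks; block k is implemented
# by chips k and k + 4 working in parallel, so OEk = OE(k + 4).
def output_enables(a9, a8):
    """Return the tuple (OE0, ..., OE7) for the given high address bits."""
    oe = [0] * 8
    block = 2 * a9 + a8          # the 2-to-4 address decoder
    oe[block] = 1                # chip storing bits d7 .. d4 of the block
    oe[block + 4] = 1            # chip storing bits d3 .. d0 of the block
    return tuple(oe)

assert output_enables(0, 0) == (1, 0, 0, 0, 1, 0, 0, 0)   # OE0 = OE4 = 1
assert output_enables(1, 0) == (0, 0, 1, 0, 0, 0, 1, 0)   # OE2 = OE6 = 1
```

Exactly one pair of chips is enabled for every address, which is what (4.16) expresses with the four minterms of a9 and a8.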

4.7 Sequential Implementation of Algorithms

In Sect. 2.8 the relation between algorithms (programming language structures) and combinational circuits was discussed. This relation exists between algorithms and digital circuits in general, not only combinational circuits. As a matter of fact, understanding and using this relation is a basic aspect of this course. Some systems are sequential by nature because their specification includes an explicit or implicit reference to successive time intervals. Some examples have been seen before: generation and detection of sequences (Examples 4.1 and 4.2), control of sequences of events (Sects. 4.1 and 4.3.2), data transmission (Fig. 4.32), timers (Fig. 4.43), and others. However, algorithms without any time reference can also be implemented by sequential circuits.

4.7.1 A First Example

As a first example of the synthesis of a sequential circuit from an algorithm, a circuit that computes the integer square root of a natural is designed. Given a natural x it computes r = ⌊√x⌋, where ⌊a⌋ stands for the greatest natural smaller than or equal to a. The following algorithm computes a set of successive pairs (r, s) such that s = (r + 1)^2.

Algorithm 4.1

r0 = 0; s0 = 1;
for i in 0 to N loop
  si+1 = si + 2(ri + 1) + 1; ri+1 = ri + 1;
end loop;

It uses the following relation:

(r + 2)^2 = ((r + 1) + 1)^2 = (r + 1)^2 + 2(r + 1) + 1 = s + 2(r + 1) + 1.

Assume that x > 0. Initially s0 = 1 ≤ x. Then execute the loop as long as si ≤ x. When si ≤ x and si+1 > x, then

(ri+1)^2 = (ri + 1)^2 = si ≤ x and (ri+1 + 1)^2 = si+1 > x.

Thus ri+1 is the greatest natural smaller than or equal to √x, so that r = ri+1. The following naïve algorithm computes r.

Algorithm 4.2 Square Root

r = 0; s = 1;
while s ≤ x loop
  s = s + 2(r + 1) + 1; r = r + 1;
end loop;
root = r;
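Algorithm 4.2 translates directly to executable form; a sketch in Python (illustrative, not the book's notation):

```python
def square_root(x):
    """Integer square root by Algorithm 4.2: maintain s = (r + 1)**2 and
    increment r while s <= x; on exit r = floor(sqrt(x))."""
    r, s = 0, 1
    while s <= x:
        s = s + 2 * (r + 1) + 1   # (r + 2)**2 = (r + 1)**2 + 2*(r + 1) + 1
        r = r + 1
    return r

assert square_root(47) == 6   # the example of Table 4.10
assert square_root(49) == 7
```

Each loop iteration only adds and increments; no multiplication is ever performed, which is what makes the algorithm attractive for hardware despite its O(√x) step count.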

Table 4.10 Example of square root computation (x = 47)

r        0     1     2     3     4     5     6
s        1     4     9     16    25    36    49
s ≤ 47   true  true  true  true  true  true  false

Fig. 4.56 Square root implementation: iterative circuit (a. a chain of loop-body cells fed with x, the first output of cell number i being max(i, root); b. one iterative cell with inputs x, r, s and outputs next_r, next_s)

As an example, if x = 47 the successive values of r and s are given in Table 4.10. The result is r = 6 (the first value of r for which condition s ≤ x does not hold). This is obviously not a good algorithm: the number of steps is equal to the square root r. Thus, for large values of x ≈ 2^n, the number of steps is r ≈ 2^(n/2). It is used here for didactic purposes. Algorithm 4.2 is a loop, so a first option could be an iterative combinational circuit (Sect. 2.8.3). In this case the number of executions of the loop body is not known in advance; it depends on a condition (s ≤ x) computed before each loop-body execution. For that reason the algorithm must be slightly modified: once the condition stops holding true, the values of s and r do not change any more.

Algorithm 4.3 Square Root, Version 2

r = 0; s = 1;
loop
  if s ≤ x then next_s = s + 2(r + 1) + 1; next_r = r + 1;
  else next_s = s; next_r = r;
  end if;
  s = next_s; r = next_r;
end loop;
root = r;

The corresponding iterative circuit is shown in Fig. 4.56a. The iterative cell (Fig. 4.56b) executes the loop body, and the connections between adjacent cells implement the instructions s = next_s and r = next_r. The first output of cell number i is either number i or the square root of x, and once the square root has been computed the first output value does not change any more and is equal to the square root r. Thus the number of cells must be equal to the maximum value of r. But this is a very large number. As an example, if n = 32, then x < 2^32 and r < 2^16 = 65,536, so that the number of cells must be equal to 65,535, obviously far too many cells.


A better idea is a sequential implementation. It takes advantage of the fact that a sequential circuit not only implements combinational functions but also includes memory elements and, thanks to the use of a synchronization signal, makes it possible to divide the time into intervals with which the execution of different operations is associated. In this example two memory elements store r and s, respectively, and the time is divided into intervals to which groups of operations are assigned:

• First time interval (initial value of the memory elements): r = 0; s = 1;

• Second, third, fourth, . . . time intervals:

if s ≤ x then next_s = s + 2(r + 1) + 1; next_r = r + 1;
else next_s = s; next_r = r;
end if;

In the following algorithm a variable end has been added; it detects the end of the square root computation.

Algorithm 4.4 Square Root, Version 3

r = 0; s = 1; -- (initial values of memory elements)
loop
  if s ≤ x then next_s = s + 2(r + 1) + 1; next_r = r + 1; end = FALSE;
  else next_s = s; next_r = r; end = TRUE;
  end if;
  s = next_s; r = next_r; -- synchronization
end loop;
root = r;

When s > x, end is equal to TRUE and the values of r, s, and end will not change any more. The synchronization input clock will be used to modify the values of r and s by replacing their current values by their next values next_r and next_s. The sequential circuit of Fig. 4.57 implements Algorithm 4.4. Two registers store r and s. On reset = 1 the initial values are loaded: r = 0 and s = 1. The combinational circuit implements the functions next_r, next_s, and end as defined by the loop body:

if s ≤ x then next_s = s + 2(r + 1) + 1; next_r = r + 1; end = FALSE;
else next_s = s; next_r = r; end = TRUE;
end if;

On each clock pulse the current values of r and s are replaced by next_r and next_s, respectively.
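The registered structure can be mimicked in software: a combinational step computes the next values, and a simulated clock edge copies them into the registers. A Python sketch (illustrative; the flag is named done because end is a Python keyword):

```python
def sqrt_sequential(x, n_cycles=100000):
    """Clock-by-clock simulation of the circuit of Fig. 4.57."""
    r, s = 0, 1                       # reset: initial register values
    for _ in range(n_cycles):         # one iteration = one clock cycle
        # combinational circuit: next_r, next_s, done from current r, s
        if s <= x:
            next_s, next_r, done = s + 2 * (r + 1) + 1, r + 1, False
        else:
            next_s, next_r, done = s, r, True
        if done:
            return r                  # the root output is now valid
        r, s = next_r, next_s         # clock edge: registers updated
    return r

assert sqrt_sequential(47) == 6
```

The separation between "compute next values" and "update registers" is exactly the separation between the combinational circuit and the clocked registers in Fig. 4.57.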

Fig. 4.57 Square root implementation: sequential circuit (a combinational circuit with inputs x, r, and s computes next_r, next_s, and end; two registers, with clock and reset, store r and s; the outputs are root and end)

Comment 4.4

Figure 4.57 is a sequential circuit whose internal states are all pairs {(r, s): r < 2^(n/2), s = (r + 1)^2} and whose input values are all naturals x < 2^n. The corresponding state transition graph would have 2^(n/2) vertices and 2^n edges per vertex: for example (n = 32), 65,536 vertices and 4,294,967,296 edges per vertex. Obviously the specification of this sequential circuit by means of a graph or by means of tables doesn't make sense. This is an example of a system that is described by an algorithm, not by an explicit behavior description.

4.7.2 Combinational vs. Sequential Implementation

In many cases an algorithm can be implemented either by a combinational or by a sequential circuit. In the case of the example of the preceding section (Sect. 4.7.1) the choice of a sequential circuit was quite evident. In general, what criteria must be used? Conceptually, any well-defined algorithm without time references can be implemented by a combinational circuit. For that:

• Execute the algorithm using all input variable value combinations and store the corresponding output variable values in a table.
• Generate the corresponding combinational circuit.

Except in the case of very simple digital systems, this is only a theoretical proof that a combinational circuit could be defined. The obtained truth table generally is enormous, so this method would only make sense if the designer had unbounded space to implement the circuit (for example unbounded silicon area, Chap. 7) and unbounded time to develop it. A better method is to directly translate the algorithm instructions to circuits as was done in Sect. 2.8, but even with this method the result might be too large a circuit, as shown in the preceding section (Sect. 4.7.1). In conclusion, there are algorithms that cannot reasonably be implemented by a combinational circuit. As already mentioned above, a sequential circuit has the capacity to implement switching functions but also the capacity to store data within memory elements. Furthermore, the existence of a synchronization signal makes it possible to divide the time into intervals and to assign different time intervals to different operations. In particular, in the case of loop instructions, the availability of memory elements allows N identical components to be replaced by one component that iteratively executes the loop body, so that space (silicon area) is traded for time. But this method is not restricted to iterations. As an example, consider the following algorithm that consists of four instructions:

0: X1 = f1(x);
1: X2 = f2(X1);
2: X3 = f3(X2);
3: R = f4(X3);

Fig. 4.58 Combinational vs. sequential implementations (a. a chain of four components computing X1 = f1(x), X2 = f2(X1), X3 = f3(X2), and R = f4(X3); b. a combinational circuit (case) computing next_R and next_step, plus memory elements, with clock and reset, storing R and step)

It can be implemented by a combinational circuit (Fig. 4.58a) made up of four components that implement functions f1, f2, f3, and f4, respectively. Assume that all data x, X1, X2, X3, and R have the same number n of bits. Then the preceding algorithm could also be implemented by a sequential circuit. For that the algorithm is modified and a new variable step, whose values belong to {0, 1, 2, 3, 4}, is added:

step = 0; -- (initial value of the step identifier)
loop
  case step is
    when 0 => R = f1(x); step = 1;
    when 1 => R = f2(R); step = 2;
    when 2 => R = f3(R); step = 3;
    when 3 => R = f4(R); step = 4;
    when 4 => step = 4;
  end case;
end loop;

The sequential circuit of Fig. 4.58b implements the modified algorithm. It has two memory elements that store R and step (encoded) and a combinational circuit defined by the following case instruction:

case step is
  when 0 => next_R = f1(x); next_step = 1;
  when 1 => next_R = f2(R); next_step = 2;
  when 2 => next_R = f3(R); next_step = 3;
  when 3 => next_R = f4(R); next_step = 4;
  when 4 => next_step = 4;
end case;

When reset = 1 the initial value of step is set to 0. On each clock pulse R and step are replaced by next_R and next_step, respectively.
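The modified algorithm is itself executable; a Python sketch with placeholder functions f1 .. f4 (the actual functions are left unspecified in the text, so those below are purely hypothetical):

```python
def run_steps(x, functions):
    """Sequential evaluation of R = f4(f3(f2(f1(x)))) under a step counter,
    as in Fig. 4.58b; 'functions' is the list [f1, f2, f3, f4]."""
    step, R = 0, None                 # reset = 1: step is set to 0
    while step < 4:                   # state 4 is the final (idle) state
        # combinational circuit: case on step selects the function to apply
        R = functions[step](x if step == 0 else R)
        step = step + 1               # clock edge: R and step updated
    return R

# hypothetical example functions, for illustration only
f1 = lambda v: v + 1
f2 = lambda v: 2 * v
f3 = lambda v: v - 3
f4 = lambda v: v * v
assert run_steps(5, [f1, f2, f3, f4]) == 81   # ((5 + 1) * 2 - 3) ** 2
```

A single shared register R and a 3-bit step counter replace the four cascaded components of Fig. 4.58a, at the price of four clock cycles per result.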


What implementation is better? Let ci and ti be the cost and computation time of the component that computes fi. Then the cost Ccomb and computation time Tcomb of the combinational circuit are

Ccomb = c1 + c2 + c3 + c4 and Tcomb = t1 + t2 + t3 + t4.   (4.17)

The cost Csequ and the computation time Tsequ of the sequential circuit are equal to

Csequ = Ccase + Creg and Tsequ = 4·Tclock,   (4.18)

where Ccase is the cost of the circuit that implements the combinational circuit of Fig. 4.58b, Creg is the cost of the registers that store R and step, and Tclock is the clock signal period. The combinational circuit of Fig. 4.58b is a programmable resource that, under the control of the step variable, computes f1, f2, f3, or f4. There are two extreme cases:

• If no circuit part can be shared between those functions, then this programmable resource implicitly includes the four components of Fig. 4.58a plus some additional control circuitry; its cost is greater than the sum c1 + c2 + c3 + c4 = Ccomb, and Tclock must be greater than the computation time of the slowest component; thus Tsequ = 4·Tclock is greater than Tcomb = t1 + t2 + t3 + t4.
• The other extreme case is when f1 = f2 = f3 = f4 = f; then the algorithm is an iteration; let c and t be the cost and computation time of the component that implements f; then Ccomb = 4c, Tcomb = 4t, Csequ = c + Creg, and Tsequ = 4·Tclock, where Tclock must be greater than t; if the register cost is much smaller than c and if the clock period is almost equal to t, then Csequ ≈ c = Ccomb/4 and Tsequ ≈ 4t = Tcomb.

In the second case (iteration) the sequential implementation is generally better than the combinational one. In other cases, it depends on the possibility of sharing, or not, some circuit parts between functions f1, f2, f3, and f4. In conclusion, algorithms can be implemented by circuits that consist of

• Memory elements that store variables and time interval identifiers (step in the preceding example)
• Combinational components that execute operations depending on the particular time interval, with operands that are internally stored variables and input signals

The circuit structure is shown in Fig. 4.59. It is a sequential circuit whose internal states correspond to combinations of variable values and step identifier values.
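The iteration trade-off of (4.17) and (4.18) can be checked with toy numbers (purely illustrative; the cost and delay values below are invented):

```python
# Iteration case of the comparison: f1 = f2 = f3 = f4 = f,
# with component cost c and delay t (arbitrary units).
c, t = 100, 5
C_reg = 10            # register cost, assumed much smaller than c
T_clock = t           # best case: clock period equal to the component delay

C_comb, T_comb = 4 * c, 4 * t             # (4.17)
C_sequ, T_sequ = c + C_reg, 4 * T_clock   # (4.18)

assert C_sequ < C_comb     # roughly a 4x area saving...
assert T_sequ == T_comb    # ...at (ideally) no speed penalty
```

With these numbers Csequ = 110 against Ccomb = 400, while both take 20 time units: the sequential version trades a small register overhead for a large saving in area.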

Fig. 4.59 Structure of a sequential circuit that implements an algorithm (combinational circuits, fed by the inputs and by memory elements that store the variables and the step identifier, produce the outputs and the next memory contents, under clock and reset)

4.8 Finite-State Machines

Finite-state machines (FSM) are algebraic models of sequential circuits. As a matter of fact the physical implementation of a finite-state machine is a sequential circuit so that part of this section repeats subjects already studied in previous sections.

4.8.1 Definition

Like a sequential circuit, a finite-state machine has input signals, output signals, and internal states. Three finite sets are defined:

• Σ (input states, input alphabet) is the set of values of the input signals.
• Ω (output states, output alphabet) is the set of values of the output signals.
• S is the set of internal states.

The working of the finite-state machine is specified by two functions f (next-state function) and h (output function):

• f: S × Σ → S associates an internal state to every pair (internal state, input state).
• h: S × Σ → Ω associates an output state to every pair (internal state, input state).

Any sequential circuit can be modelled by a finite-state machine.

Example 4.9

A 3-bit up counter, with EN (count enable) control input, can be modelled by a finite-state machine: it has one binary input EN; three binary outputs q2, q1, and q0; and eight internal states, so that

Σ = {0, 1},
Ω = {000, 001, 010, 011, 100, 101, 110, 111},
S = {0, 1, 2, 3, 4, 5, 6, 7},
f(s, 0) = s and f(s, 1) = (s + 1) mod 8, ∀s ∈ S,
h(s, 0) = h(s, 1) = binary_encoded(s), ∀s ∈ S,

where binary_encoded(s) is the binary representation of s. The corresponding sequential circuit is shown in Fig. 4.60: it consists of a 3-bit register and a combinational circuit that implements f, for example a multiplexer and a circuit that computes q + 1.

Fig. 4.60 3-Bit counter

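The pair of functions (f, h) of Example 4.9 can be written down directly; a Python sketch (illustrative, not from the book):

```python
# Finite-state machine model of the 3-bit up counter with enable (Example 4.9).
def f(s, en):
    """Next-state function: hold the state if EN = 0, count mod 8 if EN = 1."""
    return s if en == 0 else (s + 1) % 8

def h(s):
    """Output function (Moore: it depends only on s): the encoding q2 q1 q0."""
    return ((s >> 2) & 1, (s >> 1) & 1, s & 1)

s = 0                        # reset state
for _ in range(10):          # ten clock cycles with EN = 1
    s = f(s, 1)
assert s == 2                # 10 mod 8
assert h(s) == (0, 1, 0)
```

Note that h ignores the input, which is why this particular machine fits the Moore model discussed below.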


Thus, finite-state machines can be seen as a formal way to specify the behavior of a sequential circuit. Nevertheless, they are mainly used to define the working of circuits that control sequences of operations rather than to describe the operations themselves. The difference between the Moore and Mealy models has already been seen before (Sect. 4.3.1). In terms of finite-state machines, the difference lies in the definition of the output function h. In the case of the Moore model

h: S → Ω   (4.19)

and in the case of the Mealy model

h: S × Σ → Ω.   (4.20)

In the first case the corresponding circuit structure is shown in Fig. 4.61. A first combinational circuit computes the next state as a function of the current state and of the input. Assume that its propagation time is equal to t1 seconds. Another combinational circuit computes the output state as a function of the current state. Assume that its propagation time is equal to t2 seconds. Assume also that the input signal comes from another synchronized circuit and is stable tSUinput seconds (SU means Set Up) after the active clock edge. A chronogram of the signal values during a state transition is shown in Fig. 4.62. The register delay is assumed to be negligible, so the new current state value (register output) is stable at the beginning of the clock cycle. The output will be stable after t2 seconds. The input is stable after tSUinput seconds (tSUinput could be the value t2 of another finite-state machine). The next state will be stable tSUinput + t1 seconds later. In conclusion, the clock period must be greater than t2 and than tSUinput + t1:

Tclock > max(tSUinput + t1, t2).   (4.21)

This is an example of the computation of the minimum permitted clock period and thus of the maximum clock frequency.
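Condition (4.21) is just a maximum over the two register-to-register paths; sketched numerically (the delay values are invented for illustration):

```python
def min_clock_period_moore(t1, t2, t_su_input):
    """Minimum clock period for the Moore structure of Fig. 4.61, per (4.21):
    the next-state path (t_su_input + t1) and the output path (t2) must both
    settle within one clock period."""
    return max(t_su_input + t1, t2)

# hypothetical delays, in nanoseconds
assert min_clock_period_moore(t1=4, t2=7, t_su_input=2) == 7   # output path dominates
assert min_clock_period_moore(t1=6, t2=7, t_su_input=2) == 8   # next-state path dominates
```

The same max-over-paths reasoning yields (4.22) for the Mealy model, where the input delay appears in both paths.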

Fig. 4.61 Moore model: sequential circuit structure (combinational circuit 1, of delay t1, computes the next state from the current state and the input state, which has delay tSUinput; a clocked register with reset holds the current state; combinational circuit 2, of delay t2, computes the output state from the current state only)

Fig. 4.62 Moore model: chronogram (clock, current state, input state, output state, next state; delays tSUinput, t2, and t1)


Fig. 4.63 Mealy model: sequential circuit structure (as in Fig. 4.61, except that combinational circuit 2 computes the output state from both the current state and the input state)

Fig. 4.64 Mealy model: chronogram (clock, current state, input state, output state, next state; delays tSUinput, t2, and t1)

The circuit structure that corresponds to the Mealy model is shown in Fig. 4.63. A first combinational circuit computes the next state as a function of the current state and of the input. Assume that its propagation time is equal to t1 seconds. Another combinational circuit computes the output state as a function of the current state and of the input. Assume that its propagation time is equal to t2 seconds. Assume also that the input signal comes from another synchronized circuit and is stable tSUinput seconds (SU means Set Up) after the active clock edge. A chronogram of the signal values during a state transition is shown in Fig. 4.64. As before, the register delay is assumed to be negligible, so the new current state value is stable at the beginning of the clock cycle. The input is stable after tSUinput seconds. The next state will be stable tSUinput + t1 seconds later and the output will be stable tSUinput + t2 seconds later. In conclusion, the clock period must be greater than tSUinput + t2 and than tSUinput + t1:

Tclock > max(tSUinput + t1, tSUinput + t2).   (4.22)

4.8.2 VHDL Model

Throughout this course a formal language (pseudo-code), very similar to VHDL, has been used to describe algorithms. In this section, complete executable VHDL definitions of finite-state machines are presented. An introduction to VHDL is given in Appendix A. The structure of a Moore finite-state machine is shown in Fig. 4.61. It consists of three blocks: a combinational circuit that computes the next state, a combinational circuit that computes the output, and a register. Thus, a straightforward VHDL description consists of three processes, one for each block:


library ieee;
use ieee.std_logic_1164.all;
use work.my_fsm.all;
entity MooreFsm is
  port (
    clk, reset: in std_logic;
    x: in std_logic_vector(N-1 downto 0);
    y: out std_logic_vector(M-1 downto 0)
  );
end MooreFsm;
architecture behavior of MooreFsm is
  signal current_state, next_state: state;
begin
  next_state_function: process(current_state, x)
  begin
    next_state

OUT(i) = A; number = number + 1;
when (OPERATION, i, j, k, f) => X(k) = f(X(i), X(j)); number = number + 1;
when (JUMP, N) => number = N;
when (JUMP_POS, i, N) => if X(i) > 0 then number = N; else number = number + 1; end if;
when (JUMP_NEG, i, N) => if X(i) < 0 then number = N; else number = number + 1; end if;
end case;
end loop;
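The case-based instruction execution can be sketched in Python; the focus here is on the jump instructions, whose only effect is on the instruction number. This is an illustrative model, not the book's specification, and the tuple encoding of instructions is an assumption:

```python
# Illustrative interpreter for the jump instructions of the processor's
# instruction set; X is the internal memory, number the instruction counter.
def execute_jump(instr, number, X):
    """Return the next instruction number after executing instr."""
    code = instr[0]
    if code == "JUMP":
        _, N = instr
        return N                               # unconditional jump
    if code == "JUMP_POS":
        _, i, N = instr
        return N if X[i] > 0 else number + 1   # jump if X(i) > 0
    if code == "JUMP_NEG":
        _, i, N = instr
        return N if X[i] < 0 else number + 1   # jump if X(i) < 0
    return number + 1                          # all other instructions fall through

X = [0] * 16
X[3] = -5
assert execute_jump(("JUMP", 40), 7, X) == 40
assert execute_jump(("JUMP_NEG", 3, 12), 7, X) == 12
assert execute_jump(("JUMP_POS", 3, 12), 7, X) == 8
```

Every non-jump instruction simply increments number, which is exactly the behavior the "go to" block of the processor must implement.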


Table 5.3 Number of instructions

Code and list of parameters   Operation            Number of instructions
ASSIGN_VALUE, k, A            Xk = A               16 × 256 = 4096
DATA_INPUT, k, j              Xk = INj             16 × 8 = 128
DATA_OUTPUT, i, j             OUTi = Xj            8 × 16 = 128
OUTPUT_VALUE, i, A            OUTi = A             8 × 256 = 2048
OPERATION, i, j, k, f         Xk = f(Xi, Xj)       16 × 2 × 16 × 16 = 8192
JUMP, N                       goto N               256
JUMP_POS, i, N                if Xi > 0 goto N     16 × 256 = 4096
JUMP_NEG, i, N                if Xi < 0 goto N     16 × 256 = 4096

Comment 5.2

Table 5.2 defines eight instruction types. The number of different instructions depends on the parameter sizes. Assume that the internal memory X stores sixteen 8-bit data and that the program memory stores at most 256 instructions, so that the addresses are also 8-bit vectors. Assume also that there are two different operations f. The number of instructions of each type is shown in Table 5.3: there are 16 memory elements Xi, 256 constants A, 8 input ports INi, 8 output ports OUTi, 256 addresses N, and 2 operations f. The total number is 4096 + 128 + 128 + 2048 + 8192 + 256 + 4096 + 4096 = 23,040, a number greater than 2^14. Thus, the minimum number of bits needed to associate a different binary code to every instruction is 15.
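The count of Comment 5.2 is easy to reproduce:

```python
# Instruction counts of Table 5.3, recomputed from the parameter sizes.
counts = {
    "ASSIGN_VALUE": 16 * 256,        # 16 registers x 256 constants
    "DATA_INPUT":   16 * 8,          # 16 registers x 8 input ports
    "DATA_OUTPUT":  8 * 16,          # 8 output ports x 16 registers
    "OUTPUT_VALUE": 8 * 256,         # 8 output ports x 256 constants
    "OPERATION":    16 * 2 * 16 * 16,  # dest x functions x two operands
    "JUMP":         256,             # 256 target addresses
    "JUMP_POS":     16 * 256,        # 16 registers x 256 addresses
    "JUMP_NEG":     16 * 256,
}
total = sum(counts.values())
assert total == 23040
assert 2 ** 14 < total <= 2 ** 15          # hence 15 code bits are needed
assert (total - 1).bit_length() == 15      # minimum code width in bits
```

The 15-bit lower bound is information-theoretic; an actual encoding (for example a fixed-width field per parameter) may well use more bits for the sake of simple decoding.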

5.3 Structural Specification

The implementation method is top-down. The first step was the definition of a functional specification (Sect. 5.2). Now, this specification will be translated to a block diagram.

5.3.1 Block Diagram

To deduce a block diagram from the functional specification (Algorithm 5.3), the following method is used: extract from the algorithm the set of processed data, the list of data transfers, and the list of data operations. The processed data are the following:

• Input data: There are eight input ports INi and an input signal instruction.
• Output data: There are eight output ports OUTi and an output signal number.
• Internal data: There are 16 internally stored data Xi.

The data transfers are the following:

• Transmit the value of a memory element Xj or of a constant A to an output port.
• Update number with number + 1 or with a jump address N.
• Store in a memory element Xk a constant A, an input port value INj, or the result of an operation f.

The operations are {f(Xi, Xj)} with all possible functions f. The proposed block diagram is shown in Fig. 5.7. It consists of five components.

Fig. 5.7 Block diagram

• Register bank: This component contains the set of internal memory elements {Xi}. It is a 16-word memory with a data input to Xk and two data outputs Xi and Xj. It will be implemented in such a way that, within a clock period, two data Xi and Xj can be read and an internal memory element Xk can be updated (written). Thus the operation Xk = f(Xi, Xj) is executed in one clock cycle. This block is controlled by the instruction code and by the parameters i, j, and k.
• Output selection: This component transmits to the output port OUTi the rightmost register bank output Xj or a constant A. It is controlled by the instruction code and by the parameters i and A.
• Go to: It is a programmable counter: it stores the current value of number and, during the execution of each instruction, replaces number by number + 1 or by a jump address N. It is controlled by the instruction code, by the parameter N, and by the leftmost register bank output Xi, whose most significant bit (sign bit) is used in the case of conditional jumps.
• Input selection: This component selects the data to be sent to the register bank, that is, an input port INj, a constant A, or the result of an operation f. It is controlled by the instruction code and by the parameters j and A.
• Computation resources: This is an arithmetic unit that computes a function f whose operands are the two register bank outputs Xi and Xj. It is controlled by the instruction code and by the parameter f.

The set of instruction types has already been defined (Table 5.2). All input data (INj), output data (OUTj), and internally stored data (Xi, Xj, Xk) are 8-bit vectors. It remains to define the size of the instruction parameters and the arithmetic operations:

• There are eight input ports and eight output ports and the register bank stores 16 words; thus i, j, and k are 4-bit vectors.
• The maximum number of instructions is 256, so that number is an 8-bit natural.
• With regard to the arithmetic operations, the sixteen 8-bit vectors X0 to X15 are interpreted as 2's complement integers ((3.4) with n = 8). Thus −128 ≤ Xi ≤ 127, ∀i = 0–15. There are two operations f: Xk = (Xi + Xj) mod 256 and Xk = (Xi − Xj) mod 256. The instruction encoding will be defined later.
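The interplay between the unsigned mod-256 computation and the 2's complement interpretation can be illustrated with a small sketch (Python, for illustration only; `to_signed` is a helper name introduced here):

```python
# Sketch: the two operations f on 8-bit data, computed mod 256 on the unsigned
# byte representations, then reinterpreted as 2's complement integers.
def to_signed(v):
    # Interpret an 8-bit vector (0..255) as a 2's complement integer (-128..127).
    return v - 256 if v >= 128 else v

def add_mod256(xi, xj):    # Xk = (Xi + Xj) mod 256
    return (xi + xj) % 256

def sub_mod256(xi, xj):    # Xk = (Xi - Xj) mod 256
    return (xi - xj) % 256

# (-3) + 5 = 2: -3 is stored as the byte 253.
print(to_signed(add_mod256(253, 5)))   # 2
# 5 - 7 = -2: the result byte is 254.
print(to_signed(sub_mod256(5, 7)))     # -2
```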


Fig. 5.8 Input selection

Fig. 5.9 Output selection

5.3.2 Component Specification

Each component will be functionally described.

5.3.2.1 Input Selection
To define the working of the input selection component (Fig. 5.8), extract from Algorithm 5.3 the instructions that select the data inputted to the register bank:

Algorithm 5.4 Input Selection
loop
  case instruction is
    when (ASSIGN_VALUE, k, A) => to_reg = A;
    when (DATA_INPUT, k, j) => to_reg = IN(j);
    when (OPERATION, i, j, k, f) => to_reg = result;
    when others => to_reg = don't care;
  end case;
end loop;
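Algorithm 5.4 can be mimicked by a small behavioral model (a Python sketch for illustration; the function name and the tuple encoding of instructions are conventions introduced here, not part of the design):

```python
# Behavioral sketch of the input selection component: choose the value sent to
# the register bank according to the instruction type.
def input_selection(instruction, IN, result):
    code = instruction[0]
    if code == "ASSIGN_VALUE":        # (ASSIGN_VALUE, k, A): select the constant A
        return instruction[2]
    if code == "DATA_INPUT":          # (DATA_INPUT, k, j): select input port IN(j)
        return IN[instruction[2]]
    if code == "OPERATION":           # (OPERATION, i, j, k, f): select result
        return result
    return None                       # don't care

IN = [10, 20, 30, 40, 50, 60, 70, 80]
print(input_selection(("ASSIGN_VALUE", 3, 99), IN, 0))        # 99
print(input_selection(("DATA_INPUT", 3, 2), IN, 0))           # 30
print(input_selection(("OPERATION", 1, 2, 3, "add"), IN, 7))  # 7
```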

5.3.2.2 Output Selection
To define the working of the output selection component (Fig. 5.9), extract from Algorithm 5.3 the instructions that select the data outputted to the output ports:

Algorithm 5.5 Output Selection
loop
  case instruction is
    when (DATA_OUTPUT, i, j) => OUT(i) = reg;

Fig. 5.10 Register bank

    when (OUTPUT_VALUE, i, A) => OUT(i) = A;
  end case;
end loop;

The output ports are registered outputs: if the executed instruction is neither DATA_OUTPUT nor OUTPUT_VALUE, or if k ≠ i, the value of OUTk does not change.

5.3.2.3 Register Bank
The register bank is a memory that stores sixteen 8-bit words (Fig. 5.10). Its working is described by the set of instructions of Algorithm 5.3 that read or write some memory elements (Xi, Xj, Xk). It is important to observe (Fig. 5.7) that in the case of the DATA_OUTPUT instruction Xj is the rightmost output of the register bank, and in the case of the JUMP_POS and JUMP_NEG instructions Xi is the leftmost output of the register bank.

Algorithm 5.6 Register Bank
loop
  case instruction is
    when (ASSIGN_VALUE, k, A) => X(k) = reg_in; left_out = don't care; right_out = don't care;
    when (DATA_INPUT, k, j) => X(k) = reg_in; left_out = don't care; right_out = don't care;
    when (DATA_OUTPUT, i, j) => right_out = X(j); left_out = don't care;
    when (OPERATION, i, j, k, f) => X(k) = reg_in; left_out = X(i); right_out = X(j);
    when (JUMP_POS, i, N) => left_out = X(i); right_out = don't care;
    when (JUMP_NEG, i, N) => left_out = X(i); right_out = don't care;


    when others => left_out = don't care; right_out = don't care;
  end case;
end loop;
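One clock cycle of the register bank, as described by Algorithm 5.6, can be sketched as follows (Python, illustrative only; the single-cycle read-then-write ordering mirrors the text, and the tuple encoding of instructions is a convention introduced here):

```python
# Behavioral sketch of Algorithm 5.6: within one clock cycle, X(i)/X(j) are read
# and X(k) is written with reg_in (the value chosen by the input selection).
def register_bank_cycle(X, instruction, reg_in):
    code = instruction[0]
    left_out = right_out = None          # don't care by default
    if code == "OPERATION":              # (OPERATION, i, j, k, f)
        _, i, j, k, f = instruction
        left_out, right_out = X[i], X[j] # reads happen within the same cycle
        X[k] = reg_in
    elif code in ("ASSIGN_VALUE", "DATA_INPUT"):
        X[instruction[1]] = reg_in       # write X(k)
    elif code == "DATA_OUTPUT":          # (DATA_OUTPUT, i, j)
        right_out = X[instruction[2]]    # rightmost output
    elif code in ("JUMP_POS", "JUMP_NEG"):
        left_out = X[instruction[1]]     # leftmost output (sign bit used for jumps)
    return left_out, right_out

X = [0] * 16
register_bank_cycle(X, ("ASSIGN_VALUE", 5, 0), 42)       # X(5) = 42
print(X[5])                                              # 42
print(register_bank_cycle(X, ("DATA_OUTPUT", 0, 5), 0))  # (None, 42)
```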

5.3.2.4 Computation Resources
To define the working of the computation resources component (Fig. 5.11), extract from Algorithm 5.3 the instruction that computes f. Remember that there are only two operations: addition and subtraction.

Algorithm 5.7 Computation Resources
loop
  case instruction is
    when (OPERATION, i, j, k, f) =>
      if f = addition then result = (left_in + right_in) mod 256;
      else result = (left_in - right_in) mod 256;
      end if;
    when others => result = don't care;
  end case;
end loop;

Fig. 5.11 Computation resources

Fig. 5.12 Go to component

5.3.2.5 Go To
This component (Fig. 5.12) is in charge of computing the address of the next instruction within the program memory:


Algorithm 5.8 Go To
number = 0;
loop
  case instruction is
    when (JUMP, N) => number = N;
    when (JUMP_POS, i, N) => if data > 0 then number = N; else number = number + 1; end if;
    when (JUMP_NEG, i, N) => if data < 0 then number = N; else number = number + 1; end if;
    when others => number = number + 1;
  end case;
end loop;
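The next-address computation of Algorithm 5.8 can be sketched as a pure function (Python, illustrative only; `data` is the leftmost register bank output used by the conditional jumps):

```python
# Behavioral sketch of the go to component: compute the next value of number.
def next_number(number, instruction, data):
    code = instruction[0]
    if code == "JUMP":                 # (JUMP, N): unconditional jump
        return instruction[1]
    if code == "JUMP_POS":             # (JUMP_POS, i, N): jump if X(i) > 0
        return instruction[2] if data > 0 else number + 1
    if code == "JUMP_NEG":             # (JUMP_NEG, i, N): jump if X(i) < 0
        return instruction[2] if data < 0 else number + 1
    return number + 1                  # all other instructions: sequential flow

print(next_number(7, ("JUMP", 0), 0))           # 0
print(next_number(7, ("JUMP_NEG", 4, 30), -1))  # 30 (condition holds)
print(next_number(7, ("JUMP_NEG", 4, 30), 5))   # 8  (condition fails)
```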

5.4 Component Implementation

The final step of this top-down implementation is the synthesis of all the components that were functionally defined in Sect. 5.3.2. Every component is implemented with logic gates, multiplexers, flip-flops, and so on. A VHDL model of each component will also be generated.

5.4.1 Input Selection Component

The input and output signals of this component are shown in Fig. 5.8 and its functional specification is defined by Algorithm 5.4. It is a combinational circuit: the value of to_reg only depends on the current values of the inputs instruction, IN0 to IN7, and result. Instead of inputting the complete instruction code to the component, a 2-bit input_control variable (Table 5.4) is defined that classifies the instruction types into four categories, namely ASSIGN_VALUE, DATA_INPUT, OPERATION, and others. Once the encoding of the instructions has been defined, an instruction decoder that generates (among others) this 2-bit variable will be designed. From Algorithm 5.4 and Table 5.4 the following description is obtained:

Algorithm 5.9 Input Selection Component
loop
  case input_control is
    when 00 => to_reg = A;
    when 01 => to_reg = IN(j);
    when 10 => to_reg = result;

Table 5.4 Encoded instruction types (input instructions)

Instruction type    input_control
ASSIGN_VALUE        00
DATA_INPUT          01
OPERATION           10
Others              11


    when 11 => to_reg = don't care;
  end case;
end loop;

Fig. 5.13 Input selection implementation

The component inputs are input_control, A, j (parameters included in the instruction), IN0 to IN7, and result, and the component output is to_reg (Fig. 5.13a). A straightforward implementation with two multiplexers is shown in Fig. 5.13b. The following VHDL model describes the circuit of Fig. 5.13b. Its architecture consists of two processes that describe the 8-bit MUX8-1 and MUX4-1:

package main_parameters is
  constant m: natural := 8; -- m-bit processor
end main_parameters;

library IEEE;
use IEEE.std_logic_1164.all;
use work.main_parameters.all;
entity input_selection is
  port (
    IN0, IN1, IN2, IN3, IN4, IN5, IN6, IN7: in std_logic_vector(m-1 downto 0);
    A, result: in std_logic_vector(m-1 downto 0);
    j: in std_logic_vector(2 downto 0);
    input_control: in std_logic_vector(1 downto 0);
    to_reg: out std_logic_vector(m-1 downto 0)
  );
end input_selection;
architecture structure of input_selection is
  signal selected_port: std_logic_vector(m-1 downto 0);
begin
  first_mux: process(j, IN0, IN1, IN2, IN3, IN4, IN5, IN6, IN7)
  begin
    case j is
      when "000" => selected_port <= IN0;
      when "001" => selected_port <= IN1;
      when "010" => selected_port <= IN2;
      when "011" => selected_port <= IN3;
      when "100" => selected_port <= IN4;
      when "101" => selected_port <= IN5;
      when "110" => selected_port <= IN6;
      when others => selected_port <= IN7;
    end case;
  end process;
  second_mux: process(input_control, A, selected_port, result)
  begin
    case input_control is
      when "00" => to_reg <= A;
      when "01" => to_reg <= selected_port;
      when "10" => to_reg <= result;
      when others => to_reg <= (others => '0');
    end case;
  end process;
end structure;

5.4.2 Computation Resources

The functional specification of the computation resources component is defined by Algorithm 5.7. In fact, the working of this component when the executed instruction is not an operation (others in Algorithm 5.7) doesn't matter. As before, instead of inputting the complete instruction to the component, a control variable f, equal to 0 in the case of an addition and to 1 in the case of a subtraction, will be generated by the instruction decoder. This is the component specification:

Algorithm 5.10 Arithmetic Unit
if f = 0 then result = (left_in + right_in) mod 256;
else result = (left_in - right_in) mod 256;
end if;

This component (Fig. 5.14) is a mod 256 adder/subtractor controlled by a control input f (Fig. 3.3 with n = 8, a/s = f, and without the output ovf). The following VHDL model uses the IEEE arithmetic packages. All commercial synthesis tools support those packages and would generate an efficient arithmetic unit. The package main_parameters has already been defined (Sect. 5.4.1):

Fig. 5.14 Arithmetic unit

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_unsigned.all;
use work.main_parameters.all;
entity computation_resources is
  port (
    left_in, right_in: in std_logic_vector(m-1 downto 0);
    f: in std_logic;
    result: out std_logic_vector(m-1 downto 0)
  );
end computation_resources;

architecture behavior of computation_resources is
begin
  process(f, left_in, right_in)
  begin
    if f = '0' then result <= left_in + right_in;
    else result <= left_in - right_in;
    end if;
  end process;
end behavior;

5.4.3 Output Selection
. . .
    when (OUTPUT_VALUE, i, A) => OUT(i) = A;
    when others => null;
  end case;
end loop;

The component inputs are out_en, out_sel, A, i (parameters included in the instruction), and reg, and the component outputs are OUT0 to OUT7 (Fig. 5.15a). An implementation is shown in Fig. 5.15b. The outputs OUT0 to OUT7 are stored in eight registers, each of them with a CEN (clock enable) control input. The clock signal is not represented in Fig. 5.15b. An address decoder and

Table 5.5 Encoded instruction types (output instructions)

Instruction type    out_en    out_sel
DATA_OUTPUT         1         1
OUTPUT_VALUE        1         0
Others              0         –

Fig. 5.15 Output selection implementation

a set of AND2 gates generate eight signals EN0 to EN7 that enable the clock signals of the output registers. Thus, the clock of output register number s is enabled if s = i and out_en = 1. The value that is stored in the selected register is either A (if out_sel = 0) or reg (if out_sel = 1). If out_en = 0, none of the clock signals is enabled, so that the value of out_sel doesn't matter (Table 5.5). The following VHDL model describes the circuit of Fig. 5.15b. Its architecture consists of four processes that describe the 3-to-8 address decoder, the set of eight AND2 gates, the 8-bit MUX2-1, and the set of output registers. The eight outputs of the address decoder are defined as vector DEC_OUT:

library ieee;
use ieee.std_logic_1164.all;
use work.main_parameters.all;
entity output_selection is
  port (
    A, reg: in std_logic_vector(m-1 downto 0);
    clk, out_en, out_sel: in std_logic;
    i: in std_logic_vector(2 downto 0);
    OUT0, OUT1, OUT2, OUT3, OUT4, OUT5, OUT6, OUT7: out std_logic_vector(m-1 downto 0)
  );
end output_selection;
architecture structure of output_selection is
  signal EN: std_logic_vector(0 to 7);
  signal DEC_OUT: std_logic_vector(0 to 7);
  signal to_ports: std_logic_vector(m-1 downto 0);
begin
  decoder: process(i)
  begin
    case i is
      when "000" => DEC_OUT <= "10000000";
      . . .

. . .

comp2: computation_resources port map (left_in => left_out, right_in => right_out, f => instruction(12), result => result);
comp3: output_selection port map (A => instruction(7 downto 0), reg => right_out, clk => clk, out_en => out_en, out_sel => instruction(13), i => instruction(10 downto 8), OUT0 => OUT0, OUT1 => OUT1, OUT2 => OUT2, OUT3 => OUT3, OUT4 => OUT4, OUT5 => OUT5, OUT6 => OUT6, OUT7 => OUT7);
comp4: register_bank port map (reg_in => reg_in, clk => clk, write_reg => write_reg, i => instruction(11 downto 8), j => instruction(7 downto 4), k => instruction(3 downto 0), left_out => left_out, right_out => right_out);
comp5: go_to port map (N => instruction(7 downto 0), data => left_out, clk => clk, reset => reset, numb_sel =>
instruction(15 downto 12), number => number);

-- Boolean equations:
out_en <= . . .

. . . (IN0 => IN0, IN1 => IN1, IN2 => IN2, . . ., instruction => instruction, clk => clk, reset => reset, OUT0 => OUT0, OUT1 => OUT1, . . ., number => number);
digital_clock
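The clock-enable scheme of the output selection circuit (Fig. 5.15b: decoder, AND2 gates, MUX2-1, registers) can be summarized by a behavioral model (Python, illustrative only; the function name is a convention introduced here):

```python
# Behavioral sketch of Fig. 5.15b: register s is written only when s = i and
# out_en = 1; the written value depends on out_sel (Table 5.5).
def output_selection_cycle(OUT, i, out_en, out_sel, A, reg):
    to_ports = reg if out_sel == 1 else A           # 8-bit MUX2-1
    for s in range(8):
        EN = 1 if (s == i and out_en == 1) else 0   # decoder output ANDed with out_en
        if EN:
            OUT[s] = to_ports                       # register s loads on the clock edge
    return OUT

OUT = [0] * 8
output_selection_cycle(OUT, 3, 1, 0, 77, 55)   # OUTPUT_VALUE: stores the constant A
print(OUT[3])                                  # 77
output_selection_cycle(OUT, 3, 0, 1, 0, 99)    # out_en = 0: nothing changes
print(OUT[3])                                  # 77
```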


Preface

Digital electronic components are present in almost all our private and professional activities:

• Our personal computers, our smartphones, and our tablets are made up of digital components such as microprocessors, memories, interface circuits, and so on.
• Digital components are also present within our cars, our TV sets, and even our household appliances.
• They are essential components of practically any industrial production line.
• They are also essential components of public transport systems, of secure access control systems, and many others.

We could say that any activity involving:

• The acquisition of data from human interfaces or from different types of sensors
• The storage of data
• The transmission of data
• The processing of data
• The use of data to control human interfaces or to control different types of actuators (e.g., mechanical actuators)

can be performed in a safe and fast way by means of digital systems. Thus, nowadays digital systems constitute a basic technical discipline, essential to any engineer. That is the reason why the Engineering School of the Autonomous University of Barcelona (UAB) has designed an introductory course entitled "Digital Systems: From Logic Gates to Processors," available on the Coursera MOOC (Massive Open Online Course) platform. This book includes all the material presented in the above-mentioned MOOC. Digital systems are constituted of electronic circuits made up (mainly) of transistors. A transistor is a very small device, similar to a simple switch. On the other hand, a digital component, like a microprocessor, is a very large circuit able to execute very complex operations. How can we build such a complex system (a microprocessor) using very simple building blocks


(the transistors)? The answer to this question is the central topic of a complete course on digital systems. This introductory course describes the basic methods used to develop digital systems, not only the traditional ones, based on the use of logic gates and flip-flops, but also more advanced techniques that permit the design of very large circuits and are based on hardware description languages and simulation and synthesis tools. At the end of this course the reader:

• Will have some idea of the way a new digital system can be developed, generally starting from a functional specification; in particular, she/he will be able to:
  – Design digital systems of medium complexity
  – Describe digital systems using a high-level hardware description language
  – Understand the operation of computers at their most basic level
• Will know the main problems the development engineer is faced with during the process of developing a new circuit
• Will understand which design tools are necessary to develop a new circuit

This course addresses (at least) two categories of people: on the one hand, people interested in knowing what a digital system is and how it can be developed, and nothing else; on the other hand, people who need some knowledge about digital systems as a previous step toward other technical disciplines, such as computer architecture, robotics, bionics, avionics, and others.

Overview Chapter 1 gives a general definition of digital systems, presents generic description methods, and gives some information about the way digital systems can be implemented under the form of electronic circuits. Chapter 2 is devoted to combinational circuits, a particular type of digital circuit (memoryless circuit). Among others, it includes an introduction to Boolean algebra, one of the mathematical tools used to define the behavior of digital circuits. In Chap. 3, a particular type of circuit, namely, arithmetic circuits, is presented. Arithmetic circuits are present in almost any system so that they deserve some particular presentation. Furthermore, they constitute a first example of reusable blocks. Instead of developing systems from scratch, a common strategy in many technical disciplines is to reuse already developed parts. This modular approach is very common in software engineering and can also be considered in the case of digital circuits. As an example, think of building a multiplier using adders and one-digit multipliers. Sequential circuits, which are circuits including memory elements, are the topic of Chap. 4. Basic sequential components (flip-flops) and basic building blocks (registers, counters, memories) are defined. Synthesis methods are


presented. In particular, the concept of finite state machines (FSM), a mathematical tool used to define the behavior of a sequential circuit, is introduced. As an example of the application of the synthesis methods described in the previous chapters, the design of a complete digital system is presented in Chap. 5. It is a generic system, able to execute a set of algorithms, depending on the contents of a memory block that stores a program. This type of system is called a processor, in this case a very simple one. The last two chapters are dedicated to more general considerations about design methods and tools (Chap. 6) and about physical implementations (Chap. 7). Throughout the course, a standard hardware description language, namely VHDL, is used to describe circuits. A short introduction to VHDL is included in Appendix A. In order to define algorithms, a more informal, non-executable language (pseudocode) is used. It is defined in Appendix B. Appendix C is an introduction to the binary numeration system used to represent numbers.

Tarragona, Spain
Bellaterra, Spain
Barcelona, Spain

Jean-Pierre Deschamps
Elena Valderrama
Lluís Terés

Acknowledgments

The authors thank the people who have helped them in developing this book, especially Prof. Mercè Rullán, who reviewed the text and is the author of Appendices B and C. They are grateful to the following institutions for providing them the means for carrying this work through to a successful conclusion: Autonomous University of Barcelona, National Center of Microelectronics (CSIC, Bellaterra, Spain), and University Rovira i Virgili (Tarragona, Spain).


Contents

1 Digital Systems
  1.1 Definition
  1.2 Description Methods
    1.2.1 Functional Description
    1.2.2 Structural Description
    1.2.3 Hierarchical Description
  1.3 Digital Electronic Systems
    1.3.1 Real System Structure
    1.3.2 Electronic Components
    1.3.3 Synthesis of Digital Electronic Systems
  1.4 Exercises
  References

2 Combinational Circuits
  2.1 Definitions
  2.2 Synthesis from a Table
  2.3 Boolean Algebra
    2.3.1 Definition
    2.3.2 Some Additional Properties
    2.3.3 Boolean Functions and Truth Tables
    2.3.4 Example
  2.4 Logic Gates
    2.4.1 NAND and NOR
    2.4.2 XOR and XNOR
    2.4.3 Tristate Buffers and Tristate Inverters
  2.5 Synthesis Tools
    2.5.1 Redundant Terms
    2.5.2 Cube Representation
    2.5.3 Adjacency
    2.5.4 Karnaugh Map
  2.6 Propagation Time
  2.7 Other Logic Blocks
    2.7.1 Multiplexers
    2.7.2 Multiplexers and Memory Blocks
    2.7.3 Planes
    2.7.4 Address Decoder and Tristate Buffers
  2.8 Programming Language Structures
    2.8.1 If Then Else
    2.8.2 Case
    2.8.3 Loops
    2.8.4 Procedure Calls
    2.8.5 Conclusion
  2.9 Exercises
  References

3 Arithmetic Blocks
  3.1 Binary Adder
  3.2 Binary Subtractor
  3.3 Binary Adder/Subtractor
  3.4 Binary Multiplier
  3.5 Binary Divider
  3.6 Exercises
  References

4 Sequential Circuits
  4.1 Introductory Example
  4.2 Definition
  4.3 Explicit Functional Description
    4.3.1 State Transition Graph
    4.3.2 Example of Explicit Description Generation
    4.3.3 Next State Table and Output Table
  4.4 Bistable Components
    4.4.1 1-Bit Memory
    4.4.2 Latches and Flip-Flops
  4.5 Synthesis Method
  4.6 Sequential Components
    4.6.1 Registers
    4.6.2 Counters
    4.6.3 Memories
  4.7 Sequential Implementation of Algorithms
    4.7.1 A First Example
    4.7.2 Combinational vs. Sequential Implementation
  4.8 Finite-State Machines
    4.8.1 Definition
    4.8.2 VHDL Model
  4.9 Examples of Finite-State Machines
    4.9.1 Programmable Timer
    4.9.2 Sequence Recognition
  4.10 Exercises
  References

5 Synthesis of a Processor
  5.1 Definition
    5.1.1 Specification
    5.1.2 Design Strategy
  5.2 Functional Specification
    5.2.1 Instruction Types
    5.2.2 Specification
  5.3 Structural Specification
    5.3.1 Block Diagram
    5.3.2 Component Specification
  5.4 Component Implementation
    5.4.1 Input Selection Component
    5.4.2 Computation Resources
    5.4.3 Output Selection
    5.4.4 Register Bank
    5.4.5 Go To Component
  5.5 Complete Processor
    5.5.1 Instruction Encoding
    5.5.2 Instruction Decoder
    5.5.3 Complete Circuit
  5.6 Test
  References

6 Design Methods
  6.1 Structural Description
  6.2 RTL Behavioral Description
  6.3 High-Level Synthesis Tools
  References

7 Physical Implementation
  7.1 Manufacturing Technologies
  7.2 Implementation Strategies
    7.2.1 Standard Cell Approach
    7.2.2 Mask Programmable Gate Arrays
    7.2.3 Field Programmable Gate Arrays
  7.3 Synthesis and Physical Implementation Tools
  References

Appendix A: A VHDL Overview
Appendix B: Pseudocode Guidelines for the Description of Algorithms
Appendix C: Binary Numeration System

Index

About the Authors

Jean-Pierre Deschamps received an M.S. degree in electrical engineering from the University of Louvain, Belgium, in 1967; a Ph.D. in computer science from the Autonomous University of Barcelona, Spain, in 1983; and a Ph.D. degree in electrical engineering from the Polytechnic School of Lausanne, Switzerland, in 1984. He has worked in several companies and universities. His research interests include ASIC and FPGA design and digital arithmetic. He is the author of ten books and more than a hundred international papers.

Elena Valderrama received an M.S. degree in physics from the Autonomous University of Barcelona (UAB), Spain, in 1975, and a Ph.D. in 1979. Later, in 2006, she obtained a degree in medicine from the same university. She is currently a professor at the Microelectronics Department of the Engineering School of UAB. From 1980 to 1998, she was a researcher assigned to the IMB-CNM (CSIC), where she led several biomedical projects in which the design and integration of highly complex digital systems (VLSI) was crucial. Her current interests focus primarily on education, not only from the point of view of the professor but also in the management and quality control of engineering-related educational programs. Her research interests revolve around the biomedical applications of microelectronics.

Lluís Terés received an M.S. degree in 1982 and a Ph.D. in 1986, both in computer science, from the Autonomous University of Barcelona (UAB). He has been working at UAB since 1982 and at IMB-CNM (CSIC) since its creation in 1985. He is head of the Integrated Circuits and Systems (ICAS) group at IMB, with research activity in the fields of ASICs, sensor signal interfaces, body-implantable monitoring systems, integrated N/MEMS interfaces, flexible platform-based systems and SoC, and organic/printed microelectronics. He has participated in more than 60 industrial and research projects. He is coauthor of more than 70 papers and 8 patents and has participated in two spin-offs. He is also a part-time assistant professor at UAB.


1 Digital Systems

This first chapter is divided into three sections. The first section defines the concept of a digital system. For that, the more general concept of a physical system is first defined; then, the particular characteristics of digital physical systems are presented. In the second section, several methods of digital system specification are considered. A correct and unambiguous initial system specification is a key aspect of the development work. Finally, the third section is a brief introduction to digital electronics.

1.1 Definition

As a first step, the more general concept of a physical system is introduced. It is not easy to give a complete and rigorous definition of a physical system. Nevertheless, the expression has a rather clear intuitive meaning, and some of its more important characteristics can be underlined. A physical system could be defined as a set of interconnected objects or elements that realize some function and are characterized by a set of input signals, a set of output signals, and a relation between input and output signals. Furthermore, every signal is characterized by

• its type, for example a voltage, a pressure, a temperature, or a switch state;
• a range of values, for example all voltages between 0 and 1.5 V, or all temperatures between 15 and 25 °C.

Example 1.1 Consider the system of Fig. 1.1. It controls the working of a boiler that is part of a room heating system and is connected to a mechanical selector used to define a reference temperature. A temperature sensor measures the ambient temperature. Thus, the system has two input signals:

• pos: the selector position that defines the desired ambient temperature (any value between 10° and 30°);
• temp: the temperature measured by the sensor;

and one output signal:

• onoff, with two possible values: ON (start the boiler) and OFF (stop the boiler).


Fig. 1.1 Temperature control (mechanical selector pos, temperature sensor temp, temperature control block, onoff output to the boiler)

The relation between inputs and output is defined by the following program, in which half_degree is a previously defined constant equal to 0.5.

Algorithm 1.1 Temperature Control

loop
  if temp < pos - half_degree then onoff = on;
  elsif temp > pos + half_degree then onoff = off;
  end if;
  wait for 10 s;
end loop;

This is a pseudo-code program. An introduction to pseudo-code is given in Appendix B. However, this piece of program is quite easy to understand, even without any previous knowledge. Actually, the chosen pseudo-code is a simplified (non-executable) version of VHDL (Appendix A). Algorithm 1.1 is a loop whose body is executed every 10 s: the measured temperature temp is compared with the desired temperature pos defined by the mechanical selector position; then

• If temp is smaller than pos − 0.5, then the boiler must get started, so the output signal onoff = ON.
• If temp is greater than pos + 0.5, then the boiler must be stopped, so the output signal onoff = OFF.
• If temp lies between pos − 0.5 and pos + 0.5, then no action is undertaken and the value of signal onoff remains unchanged.

This is a functional specification that includes some additional characteristics of the final system. For example: the temperature updating is performed every 10 s, so the arithmetic operations must be executed in less than 10 s, and the accuracy of the control is about 0.5°. As mentioned above, the type and range of the input and output signals must be defined.

• The input signal temp represents the ambient temperature measured by a sensor. Assume that the sensor is able to measure temperatures between 0° and 50°. Then temp is a signal whose type is "temperature" and whose range is "0° to 50°."
• The input signal pos is the position of a mechanical selector. Assume that it permits choosing any temperature between 10° and 30°. Then pos is a signal whose type is "position" and whose range is "10° to 30°."
• The output signal onoff has only two possible values. Its type is "command" and its range is {ON, OFF}.
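Algorithm 1.1 can also be written in executable form. The following Python sketch (function and constant names are illustrative, not from the text) performs one step of the control loop and makes the ±0.5° dead band explicit:

```python
HALF_DEGREE = 0.5  # hysteresis half-width, as in Algorithm 1.1

def control_step(temp, pos, onoff):
    """One iteration of the control loop: returns the new onoff value.

    temp  -- measured temperature
    pos   -- desired temperature (selector position)
    onoff -- previous output; kept unchanged inside the dead band
    """
    if temp < pos - HALF_DEGREE:
        return "ON"            # too cold: start the boiler
    elif temp > pos + HALF_DEGREE:
        return "OFF"           # too warm: stop the boiler
    return onoff               # within +/- 0.5 degrees: no action

# The real system would repeat this step every 10 s.
```

For example, control_step(19.0, 20, "OFF") returns "ON", while control_step(20.2, 20, "ON") leaves the previous value "ON" unchanged because 20.2 lies inside the dead band.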

Fig. 1.2 Chronometer (push buttons reset, start, stop; time reference ref; time computation block; display of HOURS, MINUTES, SECONDS, TENTHS)

Fig. 1.3 Time reference signal (square wave ref with period 0.1 s)

Assume now that the sensor is an ideal one, able to measure the temperature with infinite accuracy, and that the selector is a continuous one, able to define the desired temperature with infinite precision. Then both signals temp and pos are real numbers whose ranges are [0, 50] and [10, 30], respectively. Such signals, characterized by a continuous and infinite range of values, are called analog signals. On the contrary, the range of the output signal onoff is a finite set {ON, OFF}. Signals whose range is a finite set (not necessarily binary as in the case of onoff) are called digital signals or discrete signals.

Example 1.2 Figure 1.2 represents the structure of a chronometer.

• Three push buttons control its working. They generate binary (2-valued) signals reset, start, and stop.
• A crystal oscillator generates a time reference signal ref (Fig. 1.3): it is a square wave signal whose period is equal to 0.1 s (10 Hz).
• A time computation system computes the value of signals h (hours), m (minutes), s (seconds), and t (tenths of a second).
• A graphical interface displays the values of signals h, m, s, and t.

Consider the time computation block. It is a physical system (a subsystem of the complete chronometer) whose input signals are

• reset, start, and stop, generated by three push buttons; according to the state of the corresponding switch, their value belongs to the set {closed, open};
• ref, the signal generated by the crystal oscillator, assumed to be an ideal square wave equal to either 0 or 1 V;

and whose output signals are

• h, belonging to the set {0, 1, 2, . . ., 23};
• m and s, belonging to the set {0, 1, 2, . . ., 59};
• t, belonging to the set {0, 1, 2, . . ., 9}.


The relation between inputs and outputs can be defined as follows (in natural language):

• When reset is pushed down, then h = m = s = t = 0.
• When start is pushed down, the chronometer starts counting; h, m, s, and t represent the elapsed time in tenths of a second.
• When stop is pushed down, the chronometer stops counting; h, m, s, and t represent the latest elapsed time.

In this example, all input and output signal values belong to finite sets. So, according to the previous definition, all input and output signals are digital. Systems all of whose input and output signals are digital are called digital systems.

1.2 Description Methods

In this section several specification methods are presented.

1.2.1 Functional Description

The relation between inputs and outputs of a digital system can be defined in a functional way, without any information about the internal structure of the system. Furthermore, a distinction can be made between explicit and implicit functional descriptions.

Example 1.3 Consider again the temperature controller of Example 1.1, with two modifications:

• The desired temperature (pos) is assumed to be constant and equal to 20° (pos = 20).
• The measured temperature has been discretized, so that the values of signal temp belong to the set {0, 1, 2, . . ., 50}.

Then the working of the controller can be described, in a completely explicit way, by Table 1.1, which associates to each value of temp the corresponding value of onoff: if temp is smaller than 20, then onoff = ON; if temp is greater than 20, then onoff = OFF; if temp is equal to 20, then onoff keeps its previous value. The same specification could be expressed by the following program.

Table 1.1 Explicit specification

temp   0   1  ...  18  19  20         21   22  ...  49   50
onoff  ON  ON ...  ON  ON  unchanged  OFF  OFF ...  OFF  OFF


Algorithm 1.2 Simplified Temperature Control

if temp < 20 then onoff = on;
elsif temp > 20 then onoff = off;
end if;

This type of description, by means of an algorithm, will be called an "implicit functional description." In such a simple example, the difference between Table 1.1 and Algorithm 1.2 is only formal; in fact it is the same description. In more complex systems, a completely explicit description (a table) could be unmanageable.

Example 1.4 As a second example of functional specification, consider a system (Fig. 1.4) that adds two 2-digit numbers. Its input signals are

• x1, x0, y1, and y0, whose values belong to {0, 1, 2, . . ., 9};

and its output signals are

• z2, whose values belong to {0, 1}, and z1 and z0, whose values belong to {0, 1, 2, . . ., 9}.

Digits x1 and x0 represent a number X belonging to the set {0, 1, 2, . . ., 99}; digits y1 and y0 represent a number Y belonging to the same set {0, 1, 2, . . ., 99}; and digits z2, z1, and z0 represent a number Z belonging to the set {0, 1, 2, . . ., 198}, where 198 = 99 + 99 is the maximum value of X + Y. An explicit functional specification is Table 1.2, which contains 10,000 rows! Another way to specify the function of a 2-digit adder is the following algorithm, in which the symbol / stands for integer division.

Fig. 1.4 2-Digit adder (inputs x1, x0, y1, y0; outputs z2, z1, z0)

Table 1.2 Explicit specification of a 2-digit adder

x1 x0   y1 y0   z2 z1 z0
  00      00      000
  00      01      001
  ...     ...     ...
  00      99      099
  01      00      001
  01      01      002
  ...     ...     ...
  01      99      100
  ...     ...     ...
  99      00      099
  99      01      100
  ...     ...     ...
  99      99      198


Algorithm 1.3 2-Digit Adder

X = 10·x1 + x0;
Y = 10·y1 + y0;
Z = X + Y;
z2 = Z / 100;
z1 = (Z − 100·z2) / 10;
z0 = Z − 100·z2 − 10·z1;

As an example, if x1 = 5, x0 = 7, y1 = 7, and y0 = 1, then

X = 10·5 + 7 = 57.
Y = 10·7 + 1 = 71.
Z = 57 + 71 = 128.
z2 = 128 / 100 = 1.
z1 = (128 − 100·1) / 10 = 28 / 10 = 2.
z0 = 128 − 100·1 − 10·2 = 8.

At the end of the algorithm execution:

X + Y = Z = 100·z2 + 10·z1 + z0.

Table 1.2 and Algorithm 1.3 are functional specifications. The first is explicit, the second is implicit, and both are directly deduced from the initial informal definition: digits x1 and x0 represent X, digits y1 and y0 represent Y, and z2, z1, and z0 represent Z = X + Y. Another way to define the working of the 2-digit adder is to use the classical pencil and paper algorithm. Given two 2-digit numbers x1 x0 and y1 y0,

• Compute s0 = x0 + y0.
• If s0 < 10, then z0 = s0 and carry = 0; in the contrary case (s0 ≥ 10), z0 = s0 − 10 and carry = 1.
• Compute s1 = x1 + y1 + carry.
• If s1 < 10, then z1 = s1 and z2 = 0; in the contrary case (s1 ≥ 10), z1 = s1 − 10 and z2 = 1.

Algorithm 1.4 Pencil and Paper Algorithm

s0 = x0 + y0;
if s0 ≥ 10 then z0 = s0 − 10; carry = 1;
else z0 = s0; carry = 0;
end if;
s1 = x1 + y1 + carry;
if s1 ≥ 10 then z1 = s1 − 10; z2 = 1;
else z1 = s1; z2 = 0;
end if;

As an example, if x1 = 5, x0 = 7, y1 = 7, and y0 = 1, then

s0 = 7 + 1 = 8; s0 < 10, so that z0 = 8 and carry = 0;
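Algorithm 1.3 translates almost line by line into an executable sketch; the following Python function (its name is illustrative, not from the text) reproduces the worked example 57 + 71 = 128:

```python
def add_2digit(x1, x0, y1, y0):
    """2-digit decimal adder following Algorithm 1.3."""
    X = 10 * x1 + x0
    Y = 10 * y1 + y0
    Z = X + Y
    z2 = Z // 100                  # hundreds digit (0 or 1)
    z1 = (Z - 100 * z2) // 10      # tens digit
    z0 = Z - 100 * z2 - 10 * z1    # units digit
    return z2, z1, z0
```

With the digits of the worked example, add_2digit(5, 7, 7, 1) returns (1, 2, 8), the digits of 128.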

Fig. 1.5 1-Digit adder (inputs x, y, carryIN; outputs z, carryOUT)

s1 = 5 + 7 + 0 = 12; s1 ≥ 10, so that z1 = 12 − 10 = 2 and z2 = 1; and thus 57 + 71 = 128.

Comment 1.1 Algorithm 1.4 is another implicit functional specification. However, it is not directly deduced from the initial informal definition, as was the case for Table 1.2 and Algorithm 1.3. It includes a particular step-by-step addition method and, to some extent, already gives some indication about the structure of the system (the subject of Sect. 1.2.2). Furthermore, it could easily be generalized to the case of n-digit operands for any n > 2.
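The pencil and paper method of Algorithm 1.4 can be checked the same way; this Python sketch (function name illustrative) follows the digit-by-digit steps and the carry propagation:

```python
def pencil_and_paper_add(x1, x0, y1, y0):
    """2-digit decimal addition with an explicit carry (Algorithm 1.4)."""
    s0 = x0 + y0
    if s0 >= 10:
        z0, carry = s0 - 10, 1     # units digit overflows: carry out
    else:
        z0, carry = s0, 0
    s1 = x1 + y1 + carry           # tens digits plus incoming carry
    if s1 >= 10:
        z1, z2 = s1 - 10, 1
    else:
        z1, z2 = s1, 0
    return z2, z1, z0
```

Again, pencil_and_paper_add(5, 7, 7, 1) yields (1, 2, 8), in agreement with the explicit table and with Algorithm 1.3.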

1.2.2 Structural Description

Another way to specify the relation between inputs and outputs of a digital system is to define its internal structure. For that, a set of previously defined and reusable subsystems, called components, must be available.

Example 1.5 Assume that a component called 1-digit adder (Fig. 1.5) has been previously defined. Its input signals are

• digits x and y, belonging to {0, 1, 2, . . ., 9};
• carryIN ∈ {0, 1};

and its output signals are

• z ∈ {0, 1, 2, . . ., 9};
• carryOUT ∈ {0, 1}.

Every 1-digit adder component executes the operations that correspond to a particular step of the pencil and paper addition method (Algorithm 1.4):

• Add two digits and an incoming carry.
• If the obtained sum is greater than or equal to 10, subtract 10 and set the outgoing carry to 1; in the contrary case, the outgoing carry is 0.

The following algorithm specifies its working.

Fig. 1.6 4-Digit adder (four 1-digit adders in a chain; the carry input of the rightmost adder is 0)

Algorithm 1.5 1-Digit Adder

s = x + y + carryIN;
if s ≥ 10 then z = s − 10; carryOUT = 1;
else z = s; carryOUT = 0;
end if;

With this component, the structure of a 4-digit adder can be defined (Fig. 1.6). It computes the sum Z = X + Y, where X = x3 x2 x1 x0 and Y = y3 y2 y1 y0 are two 4-digit numbers and Z = z4 z3 z2 z1 z0 is a 5-digit number whose most significant digit z4 is 0 or 1 (X + Y ≤ 9999 + 9999 = 19,998).

Comment 1.2 In the previous Example 1.5, four identical components (1-digit adders) are used to define a 4-digit adder by means of its structure (Fig. 1.6). The 1-digit adder in turn has been defined by its function (Algorithm 1.5). This is an example of a 2-level hierarchical description. The first level is a diagram that describes the structure of the system, while the second level is the functional description of the components.
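The structure of Fig. 1.6 can be mimicked in software: the Python sketch below (function names are illustrative) instantiates the 1-digit adder of Algorithm 1.5 once per digit and chains the carries, just as the diagram chains the four components:

```python
def digit_adder(x, y, carry_in):
    """1-digit adder component (Algorithm 1.5): returns (z, carryOUT)."""
    s = x + y + carry_in
    if s >= 10:
        return s - 10, 1
    return s, 0

def adder_4digit(x_digits, y_digits):
    """4-digit adder; digit lists given most significant first (x3 x2 x1 x0)."""
    carry = 0
    z = []
    # process from the least significant digit, like the rightmost component
    for x, y in zip(reversed(x_digits), reversed(y_digits)):
        d, carry = digit_adder(x, y, carry)
        z.append(d)
    z.append(carry)        # z4 is the final carry (0 or 1)
    return list(reversed(z))
```

For instance, adder_4digit([0, 0, 5, 7], [0, 0, 7, 1]) returns [0, 0, 1, 2, 8], i.e., 57 + 71 = 128 with z4 = 0.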

1.2.3 Hierarchical Description

Hierarchical descriptions with more than two levels can be considered. The following example uses a 3-level hierarchical description.

Example 1.6 Consider a system that computes the sum z = w + x + y, where w, x, and y are 4-digit numbers. The maximum value of z is 9999 + 9999 + 9999 = 29,997, a 5-digit number whose most significant digit is equal to 0, 1, or 2. The first hierarchical level (top level) is a block diagram with two different blocks (Fig. 1.7): a 4-digit adder and a 5-digit adder. The 4-digit adder can be divided into four 1-digit adders (Fig. 1.8), and the 5-digit adder can be divided into five 1-digit adders (Fig. 1.9). Figures 1.8 and 1.9 constitute a second hierarchical level. Finally, a 1-digit adder (Fig. 1.5) can be defined by its functional description (Algorithm 1.5). It constitutes a third hierarchical level (bottom level). Thus, the description of the system that computes z consists of three levels (Fig. 1.10). The lowest level is the functional description of a 1-digit adder. Assuming that 1-digit adder components are available, the system can be built with nine components. A hierarchical description could be defined as follows.

• It is a set of interconnected blocks.
• Every block, in turn, is described either by its function or by a set of interconnected blocks, and so on.
• The final blocks correspond to available components defined by their function.
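The top level of Fig. 1.7 is then simply a composition of two adder blocks. Assuming generic digit-list adders built from the same 1-digit component (all names are illustrative, not from the text), the whole 3-operand system reduces to two calls:

```python
def digit_adder(x, y, carry_in):
    """1-digit adder component (bottom hierarchical level)."""
    s = x + y + carry_in
    return (s - 10, 1) if s >= 10 else (s, 0)

def ndigit_adder(x_digits, y_digits):
    """n-digit adder (middle level); digit lists, most significant first.
    Returns n + 1 digits, the leading one being the final carry."""
    carry, z = 0, []
    for x, y in zip(reversed(x_digits), reversed(y_digits)):
        d, carry = digit_adder(x, y, carry)
        z.append(d)
    z.append(carry)
    return list(reversed(z))

def add3(w, x, y):
    """Top level of Fig. 1.7: u = x + y (5 digits), then z = w + u."""
    u = ndigit_adder(x, y)           # 5-digit intermediate sum
    z = ndigit_adder([0] + w, u)     # pad w to 5 digits for the second adder
    return z[1:]                     # the final carry is always 0; drop it
```

With w = x = y = 9999, add3 returns the digits of 29,997, the maximum value discussed above.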

Fig. 1.7 Top level (a 4-digit adder computes u = x + y; a 5-digit adder then computes z = w + u)

Fig. 1.8 4-Digit adder (four 1-digit adders)

Fig. 1.9 5-Digit adder (five 1-digit adders)

Fig. 1.10 Hierarchical description (three levels; the bottom level is the functional description of the 1-digit adder: s = x + y + ci; if s ≥ 10 then z = s − 10; co = 1; else z = s; co = 0; end if;)


Comments 1.3 Generally, the initial specification of a digital system is functional (a description of what the system does). In the case of very simple systems, it could be a table that defines the output signal values as a function of the input signal values. However, for more complex systems, other specification methods should be used. A natural language description (e.g., in English) is a frequent option. Nevertheless, an algorithmic description (programming language, hardware description language, pseudo-code) could be a better choice: those languages have more precise and unambiguous semantics than natural languages. Furthermore, programming language and hardware description language specifications can be compiled and executed, so that the initial specification can be tested. The use of algorithms to define the function of digital systems is one of the key aspects of this course. In other cases, the initial specification already gives some information about the way the system must be implemented (see Examples 1.5 and 1.6). In fact, the digital system designer's work is the generation of a circuit made up of available components whose behavior corresponds to the initial specification. Many times this work consists of successive refinements of an initial description: starting from an initial specification, a (top level) block diagram is generated; then every block is treated as a subsystem to which a more detailed block diagram is associated, and so on. The design work ends when all block diagrams are made up of interconnected components defined by their function and belonging to some available library of physical components (Chap. 7).

1.3 Digital Electronic Systems

The definition of digital system of Sect. 1.1 is a very general one and refers to any type of physical system whose input and output values belong to a finite set. In what follows, this course will focus on electronic systems.

1.3.1 Real System Structure

Most real digital systems include (Fig. 1.11)

• input devices such as sensors, keyboards, microphones, and communication receivers;
• output devices such as displays, motors, communication transmitters, and loudspeakers;

Fig. 1.11 Structure of a real digital system (input devices, input converters, digital electronic system, output converters, output devices)



• input converters that translate the information generated by the input devices into discrete electrical signals;
• output converters that translate discrete electrical signals into signals able to control the output devices;
• a digital electronic circuit (the brain of the system) that generates output electrical data as a function of the input electrical data.

In Example 1.2, the input devices are three switches (push buttons) and a crystal oscillator, and the output device is a 7-digit display. The time computation block is an electronic circuit that constitutes the brain of the complete system. Thus, real systems consist of a set of input and output interfaces that connect the input and output devices to the kernel of the system. The kernel of the system is a digital electronic system whose input and output signals are discrete electrical signals. In most cases those input and output signals are binary encoded data. As an example, numbers can be encoded according to the binary numeration system, and characters such as letters, digits, or some symbols can be encoded according to the standard ASCII codes (American Standard Code for Information Interchange).
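As a small illustration of such binary encodings, the following Python lines show a number written in the binary numeration system and a character encoded with its ASCII code:

```python
# Binary numeration: the number 13 as a 4-bit word
n = 13
bits = format(n, '04b')            # '1101' = 8 + 4 + 0 + 1

# ASCII: the character 'A' is encoded as the number 65
code = ord('A')                    # 65
ascii_bits = format(code, '08b')   # '01000001', an 8-bit word
```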

1.3.2 Electronic Components

To build digital electronic systems, electronic components are used. In this section some basic information about digital electronic components is given. Much more complete and detailed information about digital electronics can be found in books such as Weste and Harris (2010) or Rabaey et al. (2003).

1.3.2.1 Binary Codification

A first question: it has been mentioned above that, in most cases, the input and output signals are binary encoded data; but how are the binary digits (bits) 0 and 1 physically (electrically) represented? The usual solution consists in defining a low voltage VL and a high voltage VH, and conventionally associating VL with bit 0 and VH with bit 1. The values of VL and VH depend on the implementation technology. In this section it is assumed that VL = 0 V and VH = 1 V.

1.3.2.2 MOS Transistors

Nowadays, most digital circuits are made up of interconnected MOS transistors. They are very small devices, and large integrated circuits contain millions of transistors. MOS transistors (Fig. 1.12a, b) have three terminals called S (source), D (drain), and G (gate). There are two types of transistors: n-type (Fig. 1.12a) and p-type (Fig. 1.12b), where n and p refer to the type of majority electrical charges (carriers) that can flow from terminal S (source) to terminal D (drain) under the control of the gate voltage: in an nMOS transistor the majority carriers are

Fig. 1.12 MOS transistors



electrons (negative charges), so that the current flows from D to S; in a pMOS transistor the majority carriers are holes (positive charges), so that the current flows from S to D. A very simplified model (Fig. 1.13) is now used to describe the working of an nMOS transistor: it works like a switch controlled by the transistor gate voltage. If the gate voltage VG is low (0 V), then the switch is open (Fig. 1.14a, b) and no current can flow. If the gate voltage VG is high (1 V), then the switch is closed (Fig. 1.14c, d) and VOUT tends to be equal to VIN. However, if VIN is high (1 V), then VOUT is not equal to 1 V (Fig. 1.14d). The maximum value of VOUT is VG − VT, where the threshold voltage VT is a characteristic of the implementation technology. It could be said that an nMOS transistor is a good switch for transmitting VL (Fig. 1.14c), but not a good switch for transmitting VH (Fig. 1.14d). A similar model can be used to describe the working of a pMOS transistor. If the gate voltage VG is high (1 V), then the switch is open (Fig. 1.15a, b) and no current can flow. If the gate voltage VG is low (0 V), then the switch is closed (Fig. 1.15c, d) and VOUT tends to be equal to VIN. However, if VIN is low (0 V), then VOUT is not equal to 0 V (Fig. 1.15d). Actually, the minimum value of VOUT is VG + |VT|, where the threshold voltage VT is a characteristic of the implementation technology. It could be said that a pMOS transistor is a good switch for transmitting VH (Fig. 1.15c), but not a good switch for transmitting VL (Fig. 1.15d).
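The switch-plus-threshold behavior described here can be captured numerically. The following Python sketch (the function names and the VT = 0.3 V value are illustrative assumptions, not taken from the text) shows how each transistor type transmits one logic level well and degrades the other:

```python
VT = 0.3  # threshold voltage in volts; an assumed value for illustration

def nmos_out(vg, vin):
    """Simplified nMOS pass model: open switch when VG is low;
    when closed, VOUT cannot rise above VG - VT."""
    if vg == 0.0:
        return None                # open switch: output floating
    return min(vin, vg - VT)       # good at passing 0 V, degraded at 1 V

def pmos_out(vg, vin):
    """Simplified pMOS pass model: open switch when VG is high;
    when closed, VOUT cannot fall below VG + |VT|."""
    if vg == 1.0:
        return None                # open switch: output floating
    return max(vin, vg + abs(-VT)) # good at passing 1 V, degraded at 0 V
```

With these assumed values, a closed nMOS switch passes 0 V exactly but delivers only 0.7 V instead of 1 V, and a closed pMOS switch passes 1 V exactly but delivers 0.3 V instead of 0 V.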

Fig. 1.13 Equivalent model (a switch controlled by the gate voltage VG)

Fig. 1.14 nMOS switches (a, b: VG = 0 V, open circuit; c: VG = 1 V passing 0 V; d: VG = 1 V passing 1 V with VOUT < 1 V)

Fig. 1.15 pMOS switches (a, b: VG = 1 V, open circuit; c: VG = 0 V passing 1 V; d: VG = 0 V passing 0 V with VOUT > 0 V)


1.3.2.3 CMOS Inverter

By interconnecting several transistors, small components called logic gates can be implemented. The simplest one (Fig. 1.16) is the CMOS inverter, also called the NOT gate. A CMOS inverter consists of two transistors:

• a pMOS transistor whose source is connected to the high voltage VH (1 V), whose gate is connected to the circuit input, and whose drain is connected to the circuit output;
• an nMOS transistor whose source is connected to the low voltage VL (0 V), whose gate is connected to the circuit input, and whose drain is connected to the circuit output.

To analyze the working of this circuit in the case of binary signals, consider the two following input values:

• If VIN = 0 V, then (Fig. 1.17a), according to the simplified model of Sect. 1.3.2.2, the nMOS transistor is equivalent to an open switch and the pMOS transistor is equivalent to a closed switch (a good switch for transmitting VH), so that VOUT = 1 V.
• If VIN = 1 V, then (Fig. 1.17b) the pMOS transistor is equivalent to an open switch and the nMOS transistor is equivalent to a closed switch (a good switch for transmitting VL), so that VOUT = 0 V.

Fig. 1.16 CMOS inverter (pMOS between 1 V and the output, nMOS between the output and 0 V, both gates connected to the input)

Fig. 1.17 Working of a CMOS inverter (a: VIN = 0 V, VOUT = 1 V; b: VIN = 1 V, VOUT = 0 V)

Fig. 1.18 Inverter: behavior (IN = 0 → OUT = 1, IN = 1 → OUT = 0) and logic symbol

Fig. 1.19 2-Input NAND gate (NAND2 gate): two pMOS transistors in parallel between 1 V and the output, two nMOS transistors in series between the output and 0 V

The conclusion of this analysis is that, as long as only binary signals are considered, the circuit of Fig. 1.16 inverts the input signal: it transforms VL (0 V) into VH (1 V) and VH (1 V) into VL (0 V). In terms of bits, it transforms 0 into 1 and 1 into 0 (Fig. 1.18a). As long as only the logic behavior is considered (the relation between input bits and output bits), the standard inverter symbol of Fig. 1.18b is used.

1.3.2.4 Other Components

With four transistors (Fig. 1.19), a 2-input circuit called a NAND gate can be implemented. It works as follows:

• If VIN1 = VIN2 = 1 V, then both pMOS switches are open and both nMOS switches are closed, so that they transmit VL = 0 V to the gate output (Fig. 1.20a).
• If VIN2 = 0 V, whatever the value of VIN1, then at least one of the nMOS switches (connected in series) is open and at least one of the pMOS switches (connected in parallel) is closed, so that VH = 1 V is transmitted to the gate output (Fig. 1.20b).
• If VIN1 = 0 V, whatever the value of VIN2, the conclusion is the same.

Thus, the logic behavior of a 2-input NAND gate is given in Fig. 1.21a and the corresponding symbol is shown in Fig. 1.21b. The output of a 2-input NAND gate (NAND2) is equal to 0 if, and only if, both inputs are equal to 1. In all other cases the output is equal to 1.
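The series/parallel reasoning above can be checked with a small switch-level model; in the following Python sketch (the function name is illustrative), the output is pulled high when any pMOS of the parallel pull-up network conducts, and pulled low only when every nMOS of the series pull-down network conducts:

```python
def nand(*inputs):
    """Switch-level view of a k-input CMOS NAND gate.

    Pull-up: k pMOS transistors in parallel, each closed when its input is 0.
    Pull-down: k nMOS transistors in series, closed only when every input is 1.
    """
    pull_up = any(v == 0 for v in inputs)    # some pMOS conducts -> output 1
    pull_down = all(v == 1 for v in inputs)  # all nMOS conduct   -> output 0
    assert pull_up != pull_down              # exactly one network conducts
    return 1 if pull_up else 0
```

The assertion documents a key property of the CMOS structure: for every input combination, exactly one of the two networks connects the output, either to VH or to VL.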


Fig. 1.20 NAND gate working (a: both inputs at 1 V, output at 0 V; b: one input at 0 V, output at 1 V)

Fig. 1.21 2-Input NAND gate: behavior and symbol

IN1 IN2 | OUT
 0   0  |  1
 0   1  |  1
 1   0  |  1
 1   1  |  0

Fig. 1.22 NOR2 gate: circuit, behavior, and symbol

IN1 IN2 | OUT
 0   0  |  1
 0   1  |  0
 1   0  |  0
 1   1  |  0

Other logic gates can be defined and used as basic components of digital circuits. Some of them will now be mentioned. Much more complete information about logic gates can be found in classical books such as Floyd (2014) or Mano and Ciletti (2012). The circuit of Fig. 1.22a is a 2-input NOR gate (NOR2 gate). If VIN1 = VIN2 = 0 V, then both p-type switches are closed and both n-type switches are open, so that VH = 1 V is transmitted to the gate output. In all other cases, at least one of the p-type switches is open and at least one of the n-type


switches is closed, so that VL = 0 V is transmitted to the gate output. The logic behavior and the symbol of a NOR2 gate are shown in Fig. 1.22b, c. NAND and NOR gates with more than two inputs can be defined. The output of a k-input NAND gate is equal to 0 if, and only if, the k inputs are equal to 1. The corresponding circuit (similar to Fig. 1.19) has k p-type transistors in parallel and k n-type transistors in series. The output of a k-input NOR gate is equal to 1 if, and only if, the k inputs are equal to 0. The corresponding circuit (similar to Fig. 1.22) has k n-type transistors in parallel and k p-type transistors in series. The symbol of a 3-input NAND gate (NAND3 gate) is shown in Fig. 1.23a and the symbol of a 3-input NOR gate (NOR3 gate) is shown in Fig. 1.23b. The logic circuit of Fig. 1.24a consists of a NAND2 gate and an inverter. The output is equal to 1 if, and only if, both inputs are equal to 1 (Fig. 1.24b). It is a 2-input AND gate (AND2 gate) whose symbol is shown in Fig. 1.24c. The logic circuit of Fig. 1.25a consists of a NOR2 gate and an inverter. The output is equal to 0 if, and only if, both inputs are equal to 0 (Fig. 1.25b). It is a 2-input OR gate (OR2 gate) whose symbol is shown in Fig. 1.25c. AND and OR gates with more than two inputs can be defined. The output of a k-input AND gate is equal to 1 if, and only if, the k inputs are equal to 1, and the output of a k-input OR gate is equal to 0 if, and only if, the k inputs are equal to 0. For example, an AND3 gate can be implemented with a NAND3 gate and an inverter (Fig. 1.26a). Its symbol is shown in Fig. 1.26b. An OR3 gate can be implemented with a NOR3 gate and an inverter (Fig. 1.26c). Its symbol is shown in Fig. 1.26d.
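The gate definitions of this passage can be exercised in a few lines. The Python sketch below (function names are illustrative) builds AND and OR gates as NAND and NOR gates followed by an inverter, for any number of inputs, exactly as in Figs. 1.24, 1.25, and 1.26:

```python
def nand(*ins):
    """k-input NAND: 0 only when all inputs are 1."""
    return 0 if all(v == 1 for v in ins) else 1

def nor(*ins):
    """k-input NOR: 1 only when all inputs are 0."""
    return 1 if all(v == 0 for v in ins) else 0

def inv(a):
    """Inverter (NOT gate)."""
    return 1 - a

def and_gate(*ins):
    """AND built as a NAND gate followed by an inverter (Figs. 1.24, 1.26a)."""
    return inv(nand(*ins))

def or_gate(*ins):
    """OR built as a NOR gate followed by an inverter (Figs. 1.25, 1.26c)."""
    return inv(nor(*ins))
```

For example, and_gate(1, 1, 1) gives 1 while and_gate(1, 0, 1) gives 0, matching the k-input AND definition above.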

Fig. 1.23 NAND3 and NOR3 gate symbols

Fig. 1.24 AND2 gate: circuit (NAND2 gate plus inverter), behavior, and symbol

IN1 IN2 | OUT
 0   0  |  0
 0   1  |  0
 1   0  |  0
 1   1  |  1

Fig. 1.25 OR2 gate: circuit (NOR2 gate plus inverter), behavior, and symbol

IN1 IN2 | OUT
 0   0  |  0
 0   1  |  1
 1   0  |  1
 1   1  |  1

Fig. 1.26 AND3 and OR3 gates: circuits and symbols

Fig. 1.27 Buffer: circuit (two inverters) and symbol

Fig. 1.28 3-State buffer: circuit (inputs IN and C, output OUT) and symbol

Buffers are another type of basic digital component. The circuit of Fig. 1.27a, made up of two inverters, generates an output signal equal to the input signal. Thus, it has no logic function; it is a power amplifier. Its symbol is shown in Fig. 1.27b. The circuit of Fig. 1.28a is a 3-state buffer. It consists of a buffer, an inverter, a pMOS transistor, and an nMOS transistor. It has two inputs, IN and C (control), and an output OUT. If C = 0, then both switches (n-type and p-type) are open, so that the output OUT is disconnected from the input IN (floating state or high impedance state). If C = 1, then both switches are closed, so that the output OUT is connected to the input IN through a good (p-type) switch if IN = 1 and through a good (n-type) switch if IN = 0. The 3-state buffer symbol is shown in Fig. 1.28b. Other small-size components such as multiplexers, encoders, decoders, latches, flip-flops, and others will be defined in the next chapters. To conclude this section about digital components, an example of a larger size component is given. Figure 1.29a is the symbol of a read-only memory (ROM) that stores four 3-bit words. Its behavior is specified in Fig. 1.29b: with two address bits x1 and x0, one of the four stored words is selected and can be read from outputs z2, z1, and z0. More generally, a ROM with N address bits and M output bits stores 2^N M-bit words (in total M·2^N bits).
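Behaviorally, a ROM is just a lookup table indexed by its address bits. The following Python sketch models the 4-word, 3-bit memory of Fig. 1.29, with the stored words taken from the figure:

```python
# Contents of the ROM of Fig. 1.29: four 3-bit words, indexed by
# the two address bits (x1, x0); each word is shown as (z2, z1, z0).
ROM = {
    (0, 0): (0, 1, 0),
    (0, 1): (1, 1, 1),
    (1, 0): (1, 0, 0),
    (1, 1): (0, 1, 0),
}

def rom_read(x1, x0):
    """Select one of the four stored words with the address bits."""
    return ROM[(x1, x0)]
```

A ROM with N address bits and M output bits would simply be a table with 2^N entries of M bits each.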

Fig. 1.29 12-bit read-only memory (3 × 2²-bit ROM): symbol and behavior

x1 x0 | z2 z1 z0
 0  0 |  0  1  0
 0  1 |  1  1  1
 1  0 |  1  0  0
 1  1 |  0  1  0

1.3.3 Synthesis of Digital Electronic Systems

The central topic of this course is the synthesis of digital electronic systems. The problem can be stated in the following way.

• On the one hand, the system designer has the specification of a system to be developed. Several specification methods have been proposed in Sect. 1.2.
• On the other hand, the system designer has a catalog of available electronic components, such as logic gates and memories, and might have access to previously developed and reusable subsystems. Some of the more common electronic components have been described in Sect. 1.3.2.

The designer's work is the definition of a digital system that fulfils the initial specification and uses building blocks that belong to the catalog of available components or are previously designed subsystems. In a more formal way, it could be said that the designer's work is the generation of a hierarchical description whose final blocks are electronic components or reusable electronic subsystems.

1.4 Exercises

1. The working of the chronometer of Example 1.2 can be specified by the following program, in which the condition ref_positive_edge is assumed to be true on every positive edge of signal ref.

loop
  if reset = ON then h = 0; m = 0; s = 0; t = 0;
  elsif start = ON then
    while stop = OFF loop
      if ref_positive_edge = TRUE then update(h, m, s, t); end if;
    end loop;
  end if;
end loop;

The update procedure updates the values of h, m, s, and t every time there is a positive edge on ref, that is, every tenth of a second. Generate a pseudo-code program that defines the update procedure.


2. Given two numbers X and Y = y3·10³ + y2·10² + y1·10 + y0, the product P = X·Y can be expressed as P = y0·X + y1·X·10 + y2·X·10² + y3·X·10³. Generate a pseudo-code program based on the preceding relation to compute P.
3. Given two numbers X and Y = y3·10³ + y2·10² + y1·10 + y0, the product P = X·Y can be expressed as P = (((y3·X)·10 + y2·X)·10 + y1·X)·10 + y0·X. Generate a pseudo-code program based on the preceding relation to compute P.
4. Analyze the working of the following circuit and generate a 16-row table that defines VOUT as a function of VIN1, VIN2, VIN3, and VIN4.

[Figure: transistor circuit with inputs VIN1–VIN4 and output VOUT, connected between the 1 V and 0 V supply rails]

5. Analyze the working of the following circuit and generate an 8-row table that defines VOUT as a function of VIN1, VIN2, and VIN3.

[Figure: transistor circuit with inputs VIN1–VIN3 and output VOUT, connected between the 1 V and 0 V supply rails]



2 Combinational Circuits

Given a digital electronic circuit specification and a set of available components, how can the designer translate this initial specification to a circuit? The answer is the central topic of this course. In this chapter, an answer is given for the particular case of combinational circuits.

2.1 Definitions

A switching function is a binary function of binary variables. In other words, an n-variable switching function associates a binary value, 0 or 1, with any n-component binary vector. As an example, in Fig. 1.29, z2, z1, and z0 are three 2-variable switching functions. A digital circuit that implements a set of switching functions in such a way that at any time the output signal values depend only on the input signal values at the same moment is called a combinational circuit. The important point of this definition is "at the same moment." A combinational circuit with n inputs and m outputs is shown in Fig. 2.1. It implements m switching functions

fi: {0, 1}ⁿ → {0, 1}, i = 0, 1, ..., m − 1.

To understand the condition "at the same moment," an example of a circuit that is not combinational is now given.

Example 2.1 (A Non-combinational Circuit) Consider the temperature controller of Example 1.3 defined by Table 1.1 and substitute ON by 1 and OFF by 0. The obtained Table 2.1 does not define a combinational circuit: the knowledge that the current temperature is 20 does not suffice to decide whether the output signal must be 0 or 1. To decide, it is necessary to know the previous value of the temperature. In other words, this circuit must have some kind of memory.

Example 2.2 Consider the 4-bit adder of Fig. 2.2. Input bits x3, x2, x1, and x0 represent an integer X in binary numeration (Appendix C); input bits y3, y2, y1, and y0 represent another integer Y; input bit ci is an incoming carry; and output bits z4, z3, z2, z1, and z0 represent an integer Z. The relation between inputs and outputs is


Fig. 2.1 n-Input m-output combinational circuit: inputs x0, x1, ..., xn−1; outputs y0 = f0(x0, x1, ..., xn−1), y1 = f1(x0, x1, ..., xn−1), ..., ym−1 = fm−1(x0, x1, ..., xn−1)

Table 2.1 Specification of a non-combinational circuit

Temp  Onoff
0     1
1     1
...   ...
18    1
19    1
20    Unchanged
21    0
22    0
...   ...
49    0
50    0

Fig. 2.2 4-Bit adder: inputs x3 x2 x1 x0, y3 y2 y1 y0, and ci; outputs z3 z2 z1 z0 and co = z4

Z = X + Y + ci.

Observe that X and Y are 4-bit integers within the range 0–15, so that the maximum value of Z is 15 + 15 + 1 = 31, which is a 5-bit number. Output z4 could also be used as an outgoing carry co. In this example, the value of the output bits only depends on the value of the input bits at the same time; it is a combinational circuit.
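The behavior of Example 2.2 can be checked with a short script (a Python sketch, not part of the book's toolchain; the function name add4 is illustrative): it enumerates all input combinations and verifies that the five output bits always represent X + Y + ci.

```python
# Bit-level model of the 4-bit adder of Fig. 2.2 (illustrative sketch).
def add4(x, y, ci):
    """x, y: 4-bit integers (0..15); ci: carry-in bit.
    Returns the five output bits (z4, z3, z2, z1, z0)."""
    z = x + y + ci                       # 0 <= z <= 31, so 5 bits suffice
    return tuple((z >> k) & 1 for k in (4, 3, 2, 1, 0))

# Exhaustive check: the outputs are a function of the present inputs only.
for x in range(16):
    for y in range(16):
        for ci in (0, 1):
            bits = add4(x, y, ci)
            value = sum(b << k for b, k in zip(bits, (4, 3, 2, 1, 0)))
            assert value == x + y + ci
```

For the values used later in Sect. 2.2, add4(9, 12, 1) returns the bits of 22, that is, 10110.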

2.2 Synthesis from a Table

A completely explicit specification of a 4-bit adder (Fig. 2.2) is a table that defines five switching functions z4, z3, z2, z1, and z0 of nine variables x3, x2, x1, x0, y3, y2, y1, y0, and ci (Table 2.2). A straightforward implementation method consists in storing the Table 2.2 contents in a read-only memory (Fig. 2.3). The address bits are the input signals x3, x2, x1, x0, y3, y2, y1, y0, and ci, and the stored words define the value of the output signals z4, z3, z2, z1, and z0. As an example, if the address bits are 100111001, so that x3x2x1x0 = 1001, y3y2y1y0 = 1100, and ci = 1, then X = 9, Y = 12, and Z = 9 + 12 + 1 = 22, and the stored word is 10110, which is the binary representation of 22. Obviously this is a universal synthesis method: it can be used to implement any combinational circuit. The generic circuit of Fig. 2.1 can be implemented by the ROM of Fig. 2.4. However, in many

Table 2.2 Explicit specification of a 4-bit adder

x3x2x1x0  y3y2y1y0  ci  z4z3z2z1z0
0000      0000      0   00000
0000      0000      1   00001
0000      0001      0   00001
0000      0001      1   00010
0000      0010      0   00010
0000      0010      1   00011
...       ...       ..  ...
1001      1100      1   10110
...       ...       ..  ...
1111      1111      0   11110
1111      1111      1   11111

Fig. 2.3 ROM implementation of a 4-bit adder: address inputs x3 x2 x1 x0 y3 y2 y1 y0 ci; data outputs z4 z3 z2 z1 z0

Fig. 2.4 ROM implementation of a combinational circuit: address inputs x0, x1, ..., xn−1; data outputs y0 = f0(x0, x1, ..., xn−1), y1 = f1(x0, x1, ..., xn−1), ..., ym−1 = fm−1(x0, x1, ..., xn−1)

cases this is a very inefficient implementation method: the ROM of Fig. 2.4 must store m·2ⁿ bits, generally a (too) big number. As an example, the ROM of Fig. 2.3 stores 5·2⁹ = 2,560 bits. Instead of using a universal, but inefficient, synthesis method, a better option is to take advantage of the peculiarities of the system under development. In the case of the preceding Example 2.2 (Fig. 2.2), a first step is to divide the 4-bit adder into four 1-bit adders (Fig. 2.5). Each 1-bit adder is a combinational circuit that implements two switching functions z and d of three variables x, y, and c. Each 1-bit adder executes the operations that correspond to a particular step of the binary addition method (Appendix C). A completely explicit specification is given in Table 2.3.
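The ROM-based synthesis of Fig. 2.3 can be mimicked in software (a Python sketch; the address layout and the name build_adder_rom are illustrative assumptions): the nine address bits select one of 512 stored 5-bit words.

```python
# Build the ROM of Fig. 2.3 as a 512-entry table of 5-bit words (sketch).
def build_adder_rom():
    rom = [0] * 512                      # 2**9 addresses, one 5-bit word each
    for addr in range(512):
        x = (addr >> 5) & 0xF            # assumed layout: x3x2x1x0 y3y2y1y0 ci
        y = (addr >> 1) & 0xF
        ci = addr & 1
        rom[addr] = x + y + ci           # the word z4z3z2z1z0, as an integer
    return rom

rom = build_adder_rom()
# X = 9, Y = 12, ci = 1 -> address 100111001, stored word 10110 (= 22).
assert rom[0b100111001] == 0b10110
```

The 2,560-bit cost quoted above is simply 512 words of 5 bits each.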

Fig. 2.5 Structure of a 4-bit adder: four 1-bit adders in cascade; each 1-bit adder has inputs x, y, c and outputs z (sum) and d (outgoing carry); the carry d of each stage feeds the c input of the next stage, ci enters the rightmost stage (x0, y0), and the carry of the leftmost stage is z4

Fig. 2.6 ROM implementation of a 1-bit adder: address inputs x, y, c; data outputs d, z (the stored words are the d and z columns of Table 2.3)

Table 2.3 Explicit specification of a 1-bit adder

x  y  c  d  z
0  0  0  0  0
0  0  1  0  1
0  1  0  0  1
0  1  1  1  0
1  0  0  0  1
1  0  1  1  0
1  1  0  1  0
1  1  1  1  1

In this case, a ROM implementation could be considered (Fig. 2.6). This type of small ROM (eight 2-bit words in this example) is often called a lookup table (LUT), and it is the method used in field programmable gate arrays (FPGAs) to implement switching functions of a few variables (Chap. 7). Instead of a ROM, a table can also be implemented by means of logic gates (Sect. 1.3.2), for example AND gates, OR gates, and inverters (or NOT gates). Remember that

• The output of an n-input AND gate is equal to 1 if, and only if, its n inputs are equal to 1.
• The output of an n-input OR gate is equal to 1 if, and only if, at least one of its n inputs is equal to 1.
• The output of an inverter is equal to 1 if, and only if, its input is equal to 0.

Define now a 3-input switching function p(x, y, c) as follows: p = 1 if, and only if, x = 1, y = 0, and c = 1 (Table 2.4).


Table 2.4 Explicit specification of p

x  y  c  p
0  0  0  0
0  0  1  0
0  1  0  0
0  1  1  0
1  0  0  0
1  0  1  1
1  1  0  0
1  1  1  0

Fig. 2.7 Implementation of p: an AND3 gate with inputs x, y (through an inverter), and c; output p

Table 2.5 Explicit specification of p1, p2, p3, and p4

x  y  c  p1  p2  p3  p4
0  0  0  0   0   0   0
0  0  1  0   0   0   0
0  1  0  0   0   0   0
0  1  1  1   0   0   0
1  0  0  0   0   0   0
1  0  1  0   1   0   0
1  1  0  0   0   1   0
1  1  1  0   0   0   1

This function p is implemented by the circuit of Fig. 2.7: the output of the AND3 gate is equal to 1 if, and only if, x = 1, c = 1, and the inverter output is equal to 1, that is, if y = 0. The function d of Table 2.3 can be defined as follows: d is equal to 1 if, and only if, one of the following conditions is true:

x = 0, y = 1, c = 1;
x = 1, y = 0, c = 1;
x = 1, y = 1, c = 0;
x = 1, y = 1, c = 1.

The switching functions p1, p2, p3, and p4 can be associated with those conditions (Table 2.5). Actually, the function p of Table 2.4 is the function p2 of Table 2.5. Each function pi can be implemented in the same way as p (Fig. 2.7), as shown in Fig. 2.8. Finally, the function d can be defined as follows: d is equal to 1 if, and only if, one of the functions pi is equal to 1. The corresponding circuit is a simple OR4 gate (Fig. 2.9), and the complete circuit is shown in Fig. 2.10.
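The gate-level construction of d, four AND3 gates for p1–p4 followed by an OR4 gate, can be mirrored directly in code (a Python sketch using the same names p1–p4; the gate helpers are illustrative):

```python
# Gate primitives and the circuit of Figs. 2.8-2.10 (illustrative sketch).
def NOT(a): return 1 - a
def AND3(a, b, c): return a & b & c
def OR4(a, b, c, d): return a | b | c | d

def d_gate(x, y, c):
    p1 = AND3(NOT(x), y, c)      # x = 0, y = 1, c = 1
    p2 = AND3(x, NOT(y), c)      # x = 1, y = 0, c = 1
    p3 = AND3(x, y, NOT(c))      # x = 1, y = 1, c = 0
    p4 = AND3(x, y, c)           # x = 1, y = 1, c = 1
    return OR4(p1, p2, p3, p4)

# Compare with the d column of Table 2.3 (the carry of a 1-bit adder).
for x in (0, 1):
    for y in (0, 1):
        for c in (0, 1):
            assert d_gate(x, y, c) == (1 if x + y + c >= 2 else 0)
```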

Fig. 2.8 Implementation of p1, p2, p3, and p4: four AND3 gates, each with inputs x, y, c (with the appropriate inputs inverted)

Fig. 2.9 Implementation of d: an OR4 gate with inputs p1, p2, p3, and p4

Fig. 2.10 Complete circuit: four AND3 gates computing p1 to p4 from x, y, c, feeding an OR4 gate that outputs d

Fig. 2.11 Simplified circuit: two AND3 gates and one AND2 gate feeding an OR3 gate that outputs d

Comment 2.1 The conditions implemented by functions p3 and p4 are x = 1, y = 1, c = 0 and x = 1, y = 1, c = 1, and can obviously be substituted by the simple condition x = 1 and y = 1, whatever the value of c. Thus, in Fig. 2.10 two of the four AND3 gates can be replaced by a single AND2 gate (Fig. 2.11). The synthesis with logic gates of z (Table 2.3) is left as an exercise. In conclusion, given a combinational circuit whose initial specification is a table, two possible options are:


• To store the table contents in a ROM
• To translate the table to a circuit made up of logic gates

Furthermore, in the second case, some optimization must be considered: if the inverters are not taken into account, the circuit of Fig. 2.11 contains 4 logic gates and 11 logic gate inputs, while the circuit of Fig. 2.10 contains 5 logic gates and 16 logic gate inputs. In CMOS technology (Sect. 1.3.2) the number of transistors is equal to twice the number of gate inputs, so the latter can be used as a measure of circuit complexity. The conclusion is that a tool that helps to minimize the number of gates and the number of gate inputs is necessary. That is the topic of the next section.

2.3 Boolean Algebra

Boolean algebra is the mathematical framework used to specify and to implement switching functions. Only finite Boolean algebras are considered in this course.

2.3.1 Definition

A Boolean algebra B is a finite set over which two binary operations are defined:

• The Boolean sum +
• The Boolean product ·

Those operations must satisfy six rules (postulates). The Boolean sum and the Boolean product are internal operations:

∀a and b ∈ B: a + b ∈ B and a·b ∈ B.    (2.1)

Actually, this postulate only emphasizes the fact that + and · are operations over B. The set B includes two particular (and different) elements 0 and 1 that satisfy the following conditions:

∀a ∈ B: a + 0 = a and a·1 = a.    (2.2)

In other words, 0 and 1 are neutral elements with respect to the sum (0) and with respect to the product (1). Every element of B has an inverse in B:

∀a ∈ B, ∃ ā ∈ B such that a + ā = 1 and a·ā = 0.    (2.3)

Both operations are commutative:

∀a and b ∈ B: a + b = b + a and a·b = b·a.    (2.4)

Both operations are associative:

∀a, b and c ∈ B: a·(b·c) = (a·b)·c and a + (b + c) = (a + b) + c.    (2.5)

The product is distributive over the sum, and the sum is distributive over the product:

∀a, b and c ∈ B: a·(b + c) = a·b + a·c and a + b·c = (a + b)·(a + c).    (2.6)

Comment 2.2 Rules (2.1)–(2.6) constitute a set of symmetric postulates: given a rule, by interchanging sum and product, and 0 and 1, another rule is obtained: for example, the fact that a + 0 = a implies that a·1 = a, or the fact that a·(b + c) = a·b + a·c implies that a + b·c = (a + b)·(a + c). This property is called the duality principle. The simplest example of a Boolean algebra is the set B2 = {0, 1} with the following operations (Table 2.6):

• a + b = 1 if, and only if, a = 1 or b = 1: the OR function of a and b
• a·b = 1 if, and only if, a = 1 and b = 1: the AND function of a and b
• The inverse of a is 1 − a

It can easily be checked that all postulates are satisfied. As an example (Table 2.7), check that the product is distributive over the sum, that is, a·(b + c) = a·b + a·c. The relation between B2 and the logic gates defined in Sect. 1.3.2 is obvious: an AND gate implements a Boolean product, an OR gate implements a Boolean sum, and an inverter (NOT gate) implements the inversion (Fig. 2.12).

Table 2.6 Operations over B2

a  b  a + b  a·b  ā
0  0  0      0    1
0  1  1      0    1
1  0  1      0    0
1  1  1      1    0

Table 2.7 a·(b + c) = a·b + a·c

a  b  c  b + c  a·(b + c)  a·b  a·c  a·b + a·c
0  0  0  0      0          0    0    0
0  0  1  1      0          0    0    0
0  1  0  1      0          0    0    0
0  1  1  1      0          0    0    0
1  0  0  0      0          0    0    0
1  0  1  1      1          0    1    1
1  1  0  1      1          1    0    1
1  1  1  1      1          1    1    1

Fig. 2.12 Logic gates and Boolean functions: an OR gate implements a + b (and a + b + c), an AND gate implements a·b (and a·b·c), and an inverter implements ā


Fig. 2.13 Two equivalent circuits: (a) two AND2 gates (a·b and a·c) feeding an OR2 gate; (b) an OR2 gate (b + c) feeding an AND2 gate with input a

In fact, there is a direct relation between Boolean expressions and circuits. As an example, a consequence of the distributive property a·b + a·c = a·(b + c) is that the circuits of Fig. 2.13 implement the same switching function, say f. However, the circuit that corresponds to the Boolean expression a·b + a·c (Fig. 2.13a) has three gates and six gate inputs, while the other (Fig. 2.13b) includes only two gates and three gate inputs. This is a first (and simple) example of how Boolean algebra helps the designer to optimize circuits. Other finite Boolean algebras can be defined. Consider the set B2ⁿ = {0, 1}ⁿ, that is, the set of all 2ⁿ n-component binary vectors. It is a Boolean algebra in which product, sum, and inversion are component-wise operations:

∀a = (a0, a1, ..., an−1) and b = (b0, b1, ..., bn−1) ∈ B2ⁿ:
a + b = (a0 + b0, a1 + b1, ..., an−1 + bn−1),
a·b = (a0·b0, a1·b1, ..., an−1·bn−1),
ā = (ā0, ā1, ..., ān−1).

The neutral elements are 0 = (0, 0, ..., 0) and 1 = (1, 1, ..., 1). Another example is the set of all subsets of a finite set S. Given two subsets S1 and S2, their sum is S1 ∪ S2 (union), their product is S1 ∩ S2 (intersection), and the inverse of S1 is S\S1 (complement of S1 with respect to S). The neutral elements are the empty set Ø and S. If S has n elements, the number of subsets of S is 2ⁿ. A third example, the most important within the context of this course, is the set of all n-variable switching functions. Given two switching functions f and g, the functions f + g, f·g, and f̄ are defined as follows:

∀(x0, x1, ..., xn−1) ∈ B2ⁿ:
(f + g)(x0, x1, ..., xn−1) = f(x0, x1, ..., xn−1) + g(x0, x1, ..., xn−1),
(f·g)(x0, x1, ..., xn−1) = f(x0, x1, ..., xn−1)·g(x0, x1, ..., xn−1),
f̄(x0, x1, ..., xn−1) = (f(x0, x1, ..., xn−1))‾.

The neutral elements are the constant functions 0 and 1.
Comment 2.3 Mathematicians have demonstrated that any finite Boolean algebra is isomorphic to B2^m for some m > 0. In particular, the number of elements of any finite Boolean algebra is a power of 2. Consider the previous examples. 1. The set of subsets of a finite set S = {s1, s2, ..., sn} is a Boolean algebra isomorphic to B2ⁿ: associate with every subset S1 of S an n-component binary vector whose component i is equal to 1 if, and only if, si ∈ S1, and check that the vectors that correspond to the union of two subsets, the

intersection of two subsets, and the complement of a subset are obtained by executing the component-wise sum, the component-wise product, and the component-wise inversion of the associated n-component binary vectors. 2. The set of all n-variable switching functions is a Boolean algebra isomorphic to B2^m with m = 2ⁿ: associate a number i with each of the 2ⁿ elements of {0, 1}ⁿ (for example, the natural number represented in binary numeration by this vector); then associate with any n-variable switching function f a 2ⁿ-component vector whose component number i is the value of f at point i. In the case of functions d and z of Table 2.3, n = 3, 2ⁿ = 8, and the 8-component vectors that define d and z are (00010111) and (01101001), respectively.

2.3.2 Some Additional Properties

Apart from rules (2.1)–(2.6), several additional properties can be demonstrated and can be used to minimize Boolean expressions and to optimize the corresponding circuits.

Properties 2.1
1. 0̄ = 1 and 1̄ = 0.    (2.7)
2. Idempotence: ∀a ∈ B: a + a = a and a·a = a.    (2.8)
3. ∀a ∈ B: a + 1 = 1 and a·0 = 0.    (2.9)
4. Inverse uniqueness: if a·b = 0, a + b = 1, a·c = 0 and a + c = 1, then b = c.    (2.10)
5. Involution: ∀a ∈ B: the inverse of ā is a.    (2.11)
6. Absorption law: ∀a and b ∈ B: a + a·b = a and a·(a + b) = a.    (2.12)
7. ∀a and b ∈ B: a + ā·b = a + b and a·(ā + b) = a·b.    (2.13)
8. de Morgan laws: ∀a and b ∈ B: (a + b)‾ = ā·b̄ and (a·b)‾ = ā + b̄.    (2.14)
9. Generalized de Morgan laws: ∀a1, a2, ..., an ∈ B:
   (a1 + a2 + ... + an)‾ = ā1·ā2·...·ān and (a1·a2·...·an)‾ = ā1 + ā2 + ... + ān.    (2.15)

Proof
1. 0̄ = 0 + 0̄ = 1 and 1̄ = 1·1̄ = 0.
2. a = a + 0 = a + (a·ā) = (a + a)·(a + ā) = (a + a)·1 = a + a; a = a·1 = a·(a + ā) = (a·a) + (a·ā) = (a·a) + 0 = a·a.

3. a + 1 = a + a + ā = a + ā = 1; a·0 = a·a·ā = a·ā = 0.
4. b = b·(a + c) = a·b + b·c = 0 + b·c = a·c + b·c = (a + b)·c = 1·c = c.
5. Direct consequence of (4).
6. a + a·b = a·1 + a·b = a·(1 + b) = a·1 = a; a·(a + b) = a·a + a·b = a + a·b = a.
7. a + ā·b = (a + ā)·(a + b) = 1·(a + b) = a + b; a·(ā + b) = (a·ā) + (a·b) = 0 + (a·b) = a·b.
8. (a + b)·(ā·b̄) = a·ā·b̄ + b·ā·b̄ = 0·b̄ + 0·ā = 0 + 0 = 0; (a + b) + ā·b̄ = a + b + ā·b̄ = a + b + ā = b + 1 = 1.
9. By induction.
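Since B2 has only two elements, most of Properties 2.1 can also be verified exhaustively (a Python sketch, not part of the book; + and · are modeled with the bitwise operators, and the helper names s, p, inv are illustrative):

```python
# Brute-force check of several of Properties 2.1 over B2 = {0, 1} (sketch).
def s(a, b): return a | b          # Boolean sum
def p(a, b): return a & b          # Boolean product
def inv(a):  return 1 - a          # inverse

B2 = (0, 1)
for a in B2:
    assert s(a, a) == a and p(a, a) == a            # idempotence (2.8)
    assert s(a, 1) == 1 and p(a, 0) == 0            # (2.9)
    assert inv(inv(a)) == a                         # involution (2.11)
    for b in B2:
        assert s(a, p(a, b)) == a                   # absorption (2.12)
        assert p(a, s(a, b)) == a                   # absorption (2.12)
        assert s(a, p(inv(a), b)) == s(a, b)        # (2.13)
        assert inv(s(a, b)) == p(inv(a), inv(b))    # de Morgan (2.14)
        assert inv(p(a, b)) == s(inv(a), inv(b))    # de Morgan (2.14)
```

Exhaustive checking is possible here precisely because the algebra is finite.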

2.3.3 Boolean Functions and Truth Tables

Tables such as Table 2.3, which defines two switching functions d and z, are called truth tables. If f is an n-variable switching function, then its truth table has 2ⁿ rows, that is, the number of different n-component vectors. In this section the relation between Boolean expressions, truth tables, and gate implementations of combinational circuits is analyzed. Given a Boolean expression, that is, a well-constructed expression using variables and Boolean operations (sum, product, and inversion), a truth table can be defined. For that, the value of the expression must be computed for every combination of variable values, in total 2ⁿ different combinations if there are n variables.

Example 2.3 Consider the following Boolean expression that defines a 3-variable switching function f:

f(a, b, c) = b·c̄ + ā·b.

Define a table with as many rows as the number of combinations of values of a, b, and c, that is, 2³ = 8 rows, and compute the value of f that corresponds to each of them (Table 2.8).

Table 2.8 f(a, b, c) = b·c̄ + ā·b

abc  c̄  b·c̄  ā  ā·b  f = b·c̄ + ā·b
000  1   0    1   0    0
001  0   0    1   0    0
010  1   1    1   1    1
011  0   0    1   1    1
100  1   0    0   0    0
101  0   0    0   0    0
110  1   1    0   0    1
111  0   0    0   0    0

Conversely, a Boolean expression can be associated with any truth table. For that, first define some new concepts.

Definitions 2.1
1. A literal is a variable or the inverse of a variable. For example, a, ā, b, b̄, ... are literals.
2. An n-variable minterm is a product of n literals such that each variable appears only once. For example, if n = 3 then there are eight different minterms:

m0 = ā·b̄·c̄, m1 = ā·b̄·c, m2 = ā·b·c̄, m3 = ā·b·c,
m4 = a·b̄·c̄, m5 = a·b̄·c, m6 = a·b·c̄, m7 = a·b·c.    (2.16)

Their corresponding truth tables are shown in Table 2.9. Their main property is that with each minterm mi is associated one, and only one, combination of values of a, b, and c such that mi = 1:

m0 is equal to 1 if, and only if, abc = 000,
m1 is equal to 1 if, and only if, abc = 001,
m2 is equal to 1 if, and only if, abc = 010,
m3 is equal to 1 if, and only if, abc = 011,
m4 is equal to 1 if, and only if, abc = 100,
m5 is equal to 1 if, and only if, abc = 101,
m6 is equal to 1 if, and only if, abc = 110,
m7 is equal to 1 if, and only if, abc = 111.

In other words, mi = 1 if, and only if, abc is equal to the binary representation of i. Consider now a 3-variable function f defined by its truth table (Table 2.10). From Table 2.9 it can be deduced that f = m2 + m3 + m6, and thus (2.16)

f = ā·b·c̄ + ā·b·c + a·b·c̄.    (2.17)

More generally, the n-variable minterm mi(xn−1, xn−2, ..., x0) is equal to 1 if, and only if, the value of xn−1xn−2...x0 is the binary representation of i. Given a truth table that defines an n-variable switching function f(xn−1, xn−2, ..., x0), this function is the sum of all minterms mi such that in−1in−2...i0 is the binary representation of i and f(in−1, in−2, ..., i0) = 1. This type of representation of a switching function in the form of a Boolean sum of minterms (like (2.17)) is called the canonical representation.

Table 2.9 3-Variable minterms

abc  m0  m1  m2  m3  m4  m5  m6  m7
000  1   0   0   0   0   0   0   0
001  0   1   0   0   0   0   0   0
010  0   0   1   0   0   0   0   0
011  0   0   0   1   0   0   0   0
100  0   0   0   0   1   0   0   0
101  0   0   0   0   0   1   0   0
110  0   0   0   0   0   0   1   0
111  0   0   0   0   0   0   0   1

Table 2.10 Truth table of f

abc  f
000  0
001  0
010  1
011  1
100  0
101  0
110  1
111  0

Fig. 2.14 f = ā·b·c̄ + ā·b·c + a·b·c̄: three AND3 gates (with the appropriate inputs a, b, c inverted) feeding an OR3 gate that outputs f

The relation between truth table and Boolean expression, namely the canonical representation, has been established. From a Boolean expression, for example (2.17), a circuit made up of logic gates can be deduced (Fig. 2.14). Another example: the functions p1, p2, p3, and p4 of Table 2.5 are minterms of variables x, y, and c, and the circuit of Fig. 2.10 corresponds to the canonical representation of d. Assume that a combinational system has been specified by some functional description, for example an algorithm (an implicit functional description). The following steps permit the generation of a logic circuit that implements the function.

• Translate the algorithm to a table (an explicit functional description); for that, execute the algorithm for all combinations of the input variable values.
• Generate the canonical representation that corresponds to the table.
• Optimize the expression using properties of Boolean algebras.
• Generate the corresponding circuit made up of logic gates.

As an example, consider the following algorithm that defines a 3-variable switching function.

Algorithm 2.1 Specification of f(a, b, c)

if (a = 1 and b = 1 and c = 0) or (a = 0 and b = 1) then f = 1;
else f = 0;
end if;

Fig. 2.15 Optimized circuit: two AND2 gates (ā·b and b·c̄) feeding an OR2 gate that outputs f

Fig. 2.16 Implementation of (2.20): three AND2 gates (y·c, x·c, x·y) feeding an OR3 gate that outputs d

By executing this algorithm for each of the eight combinations of values of a, b, and c, Table 2.10 is obtained. The corresponding canonical expression is (2.17). This expression can be simplified using Boolean algebra properties:

ā·b·c̄ + ā·b·c + a·b·c̄ = ā·b·(c̄ + c) + (ā + a)·b·c̄ = ā·b + b·c̄.

The corresponding circuit is shown in Fig. 2.15. It implements the same function as the circuit of Fig. 2.14, with fewer gates and fewer gate inputs. This is an example of the kind of circuit optimization that Boolean algebras permit.
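The four-step flow above (algorithm → table → canonical sum of minterms → simplified expression) can be checked numerically (a Python sketch; the three function names are illustrative):

```python
# Algorithm 2.1, its canonical sum of minterms, and the simplified form (sketch).
def spec(a, b, c):                     # Algorithm 2.1, executed directly
    return 1 if (a == 1 and b == 1 and c == 0) or (a == 0 and b == 1) else 0

def canonical(a, b, c):                # f = m2 + m3 + m6, Eq. (2.17)
    m2 = (1 - a) & b & (1 - c)
    m3 = (1 - a) & b & c
    m6 = a & b & (1 - c)
    return m2 | m3 | m6

def simplified(a, b, c):               # f = a'·b + b·c' after optimization
    return ((1 - a) & b) | (b & (1 - c))

# All three descriptions define the same switching function.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            assert spec(a, b, c) == canonical(a, b, c) == simplified(a, b, c)
```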

2.3.4 Example

The 4-bit adder of Sect. 2.2 is now revisited and completed. A first step is to divide the 4-bit adder into four 1-bit adders (Fig. 2.5). Each 1-bit adder implements two switching functions d and z defined by their truth tables (Table 2.3). The canonical expressions that correspond to the truth tables of d and z are the following:

d = x̄·y·c + x·ȳ·c + x·y·c̄ + x·y·c;    (2.18)

z = x̄·ȳ·c + x̄·y·c̄ + x·ȳ·c̄ + x·y·c.    (2.19)

The next step is to optimize the Boolean expressions. Equation (2.18) can be optimized as follows:

d = (x̄ + x)·y·c + x·(ȳ + y)·c + x·y·(c̄ + c) = y·c + x·c + x·y.    (2.20)

Fig. 2.17 Implementation of (2.19): four AND3 gates (with the appropriate inputs x, y, c inverted) feeding an OR4 gate that outputs z

The corresponding circuit is shown in Fig. 2.16. It implements the same function d as the circuit of Fig. 2.11, with fewer gates, fewer gate inputs, and without inverters. Equation (2.19) cannot be simplified. The corresponding circuit is shown in Fig. 2.17.
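That the optimized expression (2.20) still matches the d column of Table 2.3 is easy to confirm exhaustively (a Python sketch):

```python
# Check that d = y·c + x·c + x·y, Eq. (2.20), equals the carry of Table 2.3.
def d_optimized(x, y, c):
    return (y & c) | (x & c) | (x & y)

for x in (0, 1):
    for y in (0, 1):
        for c in (0, 1):
            expected = 1 if x + y + c >= 2 else 0   # d column of Table 2.3
            assert d_optimized(x, y, c) == expected
```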

2.4 Logic Gates

In Sects. 2.2 and 2.3 a first approach to the implementation of switching functions has been proposed. It is based on the translation of the initial specification into Boolean expressions. Then circuits made up of AND gates, OR gates, and inverters can easily be defined. However, there exist other components (Sect. 1.3.2) that can be used to implement switching functions.

2.4.1 NAND and NOR

NAND gates and NOR gates have been defined in Sect. 1.3.2. They can be considered simple extensions of the CMOS inverter and are relatively easy to implement in CMOS technology. A NAND gate is equivalent to an AND gate followed by an inverter, and a NOR gate is equivalent to an OR gate followed by an inverter (Fig. 2.18). The truth tables of a 2-input NAND function and of a 2-input NOR function are shown in Figs. 1.21a and 1.22b, respectively. More generally, the output of a k-input NAND gate is equal to 0 if, and only if, the k inputs are equal to 1, and the output of a k-input NOR gate is equal to 1 if, and only if, the k inputs are equal to 0. Thus,

NAND(x1, x2, ..., xn) = (x1·x2·...·xn)‾ = x̄1 + x̄2 + ... + x̄n;    (2.21)

NOR(x1, x2, ..., xn) = (x1 + x2 + ... + xn)‾ = x̄1·x̄2·...·x̄n.    (2.22)

Sometimes, the following algebraic symbols are used:

Fig. 2.18 NAND2 and NOR2 symbols and equivalent circuits: NAND(a, b) drawn as an AND gate followed by an inverter, and NOR(a, b) drawn as an OR gate followed by an inverter

Fig. 2.19 NOT, AND2, and OR2 gates implemented with NAND2 gates and inverters: ā = NAND(a, 1) = NAND(a, a); a·b obtained by inverting NAND(a, b); a + b = NAND(ā, b̄), with each input inverted by a NAND

a ↑ b = NAND(a, b) and a ↓ b = NOR(a, b).

NAND and NOR gates are universal modules. That means that any switching function can be implemented only with NAND gates or only with NOR gates. It has been seen in Sect. 2.3 that any switching function can be implemented with AND gates, OR gates, and inverters (NOT gates). To demonstrate that NAND gates are universal modules, it is sufficient to observe that the AND function, the OR function, and the inversion can be implemented with NAND functions. According to (2.21),

x1·x2·...·xn = ((x1·x2·...·xn)‾)‾ = (NAND(x1, x2, ..., xn))‾,    (2.23)

x1 + x2 + ... + xn = NAND(x̄1, x̄2, ..., x̄n);    (2.24)

x̄ = (x·1)‾ = NAND(x, 1) = (x·x)‾ = NAND(x, x).    (2.25)

As an example, NOT, AND2, and OR2 gates implemented with NAND2 gates are shown in Fig. 2.19. Similarly, to demonstrate that NOR gates are universal modules, it is sufficient to observe that the AND function, the OR function, and the inversion can be implemented with NOR functions. According to (2.22),

x1 + x2 + ... + xn = ((x1 + x2 + ... + xn)‾)‾ = (NOR(x1, x2, ..., xn))‾,    (2.26)

x1·x2·...·xn = NOR(x̄1, x̄2, ..., x̄n);    (2.27)

x̄ = (x + 0)‾ = NOR(x, 0) = (x + x)‾ = NOR(x, x).    (2.28)
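Equations (2.23)–(2.25) can be demonstrated in code: every gate below is built from 2-input NANDs only (a Python sketch; the helper names are illustrative):

```python
# NOT, AND2, and OR2 built exclusively from NAND2, per Eqs. (2.23)-(2.25).
def nand(a, b):
    return 1 - (a & b)

def not_(a):
    return nand(a, a)                        # x' = NAND(x, x)

def and_(a, b):
    return nand(nand(a, b), nand(a, b))      # invert NAND(a, b) with a NAND

def or_(a, b):
    return nand(not_(a), not_(b))            # NAND of the inverted inputs

# Exhaustive check against the usual AND, OR, and NOT functions.
for a in (0, 1):
    for b in (0, 1):
        assert not_(a) == 1 - a
        assert and_(a, b) == (a & b)
        assert or_(a, b) == (a | b)
```

The dual construction with NOR gates, following (2.26)–(2.28), is entirely analogous.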

Example 2.4 Consider the circuit of Fig. 2.11. According to (2.23) and (2.24), the AND gates and the OR gate can be substituted by NAND gates. The result is shown in Fig. 2.20a. Furthermore, two serially connected inverters can be substituted by a simple connection (Fig. 2.20b).

Comments 2.4
1. Neither the 2-variable NAND function (NAND2) nor the 2-variable NOR function (NOR2) is an associative operation. For example,


Fig. 2.20 Circuits equivalent to Fig. 2.11: (a) the AND gates and the OR gate of Fig. 2.11 replaced by NAND gates and inverters; (b) the same circuit after removing the pairs of serially connected inverters

Fig. 2.21 XOR gate and XNOR gate symbols: XOR(a, b) and XNOR(a, b), each with inputs a and b

NAND(a, NAND(b, c)) = ā + (NAND(b, c))‾ = ā + b·c,
NAND(NAND(a, b), c) = a·b + c̄,

and none of the previous functions is equal to NAND(a, b, c) = ā + b̄ + c̄.
2. As already mentioned above, NAND gates and NOR gates are easy to implement in CMOS technology. On the contrary, AND gates and OR gates must be implemented by connecting a NAND gate and an inverter or a NOR gate and an inverter, respectively. Thus, within a CMOS integrated circuit, NAND gates and NOR gates use less silicon area than AND gates and OR gates.

2.4.2 XOR and XNOR

XOR gates, where XOR stands for eXclusive OR, and XNOR gates are other commonly used components, especially in arithmetic circuits. The 2-variable XOR switching function is defined as follows:

Table 2.11 XOR and XNOR truth tables

ab  XOR(a, b)  XNOR(a, b)
00  0          1
01  1          0
10  1          0
11  0          1

Fig. 2.22 3-Input and 4-input XOR gates and XNOR gates: symbols for XOR(a, b, c), XNOR(a, b, c), XOR(a, b, c, d), and XNOR(a, b, c, d)

Fig. 2.23 4-Input XOR and XNOR gates implemented with 2-input gates: (a) XOR(a, b, c, d) built from three XOR2 gates; (b) XNOR(a, b, c, d) built from two XOR2 gates and an XNOR2 gate

XOR(a, b) = 1 if, and only if, a ≠ b;

and the 2-variable XNOR switching function is the inverse of the XOR function, so that

XNOR(a, b) = 1 if, and only if, a = b.

Their symbols are shown in Fig. 2.21 and their truth tables are defined in Table 2.11. The following algebraic symbols are used:

a ⊕ b = XOR(a, b), a ⊙ b = XNOR(a, b).

An equivalent definition of the XOR function is

XOR(a, b) = (a + b) mod 2 = a ⊕ b.

With this equivalent definition an n-variable XOR switching function can be defined for any n > 2:

XOR(a1, a2, ..., an) = (a1 + a2 + ... + an) mod 2 = a1 ⊕ a2 ⊕ ... ⊕ an;

and the n-variable XNOR switching function is the inverse of the XOR function:

XNOR(a1, a2, ..., an) = (XOR(a1, a2, ..., an))‾.

Examples of XOR gate and XNOR gate symbols are shown in Fig. 2.22. The mod-2 sum is an associative operation, so that n-input XOR gates can be implemented with 2-input XOR gates. As an example, in Fig. 2.23a a 4-input XOR gate is implemented with three 2-input XOR gates.

An n-input XNOR gate is implemented by the same circuit as an n-input XOR gate in which the XOR gate that generates the output is substituted by an XNOR gate. In Fig. 2.23b a 4-input XNOR gate is implemented with two 2-input XOR gates and a 2-input XNOR gate. XOR gates and XNOR gates are not universal modules. However, they are very useful for implementing arithmetic functions.

Example 2.5 As a first example consider a 4-bit magnitude comparator: given two 4-bit numbers a = a3a2a1a0 and b = b3b2b1b0, generate a switching function comp equal to 1 if, and only if, a = b. The following trivial algorithm is used:

if (a3 = b3) and (a2 = b2) and (a1 = b1) and (a0 = b0) then comp = 1;
else comp = 0;
end if;

The corresponding circuit is shown in Fig. 2.24: comp = 1 if, and only if, the four inputs of the NOR4 gate are equal to 0, that is, if ai = bi and thus XOR(ai, bi) = 0, for all i = 0, 1, 2, and 3.
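The comparator of Fig. 2.24 is four XOR gates feeding a NOR gate; a Python sketch mirroring that structure (function names are illustrative):

```python
# 4-bit equality comparator of Fig. 2.24 (sketch).
def xor(a, b):
    return a ^ b

def nor4(a, b, c, d):
    return 1 - (a | b | c | d)

def comp(a_bits, b_bits):
    """a_bits, b_bits: tuples (a3, a2, a1, a0) and (b3, b2, b1, b0)."""
    diffs = [xor(ai, bi) for ai, bi in zip(a_bits, b_bits)]
    return nor4(*diffs)      # 1 iff every XOR output is 0, i.e. a = b

assert comp((1, 0, 0, 1), (1, 0, 0, 1)) == 1   # equal words
assert comp((1, 0, 0, 1), (1, 0, 1, 1)) == 0   # one bit differs
```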

Fig. 2.24 4-Bit magnitude comparator: four XOR2 gates (comparing ai with bi) feeding a NOR4 gate that outputs comp

Fig. 2.25 Transmission of 8-bit data: a data source feeds an 8-bit parity bit generator; the bits d0–d7 and the parity bit d8 are transmitted to the data destination, where a 9-bit parity bit generator produces the error signal

Fig. 2.26 Parity bit generation and parity check: trees of XOR2 gates computing d8 from d0–d7 (generator) and the error signal from d0–d8 (checker)

Fig. 2.27 1-Bit adder: (a) implementation of z and d with XOR, AND, and OR gates; (b) the same circuit with the AND and OR gates replaced by NAND gates
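The parity-based error-detection scheme of Figs. 2.25 and 2.26 amounts to XOR-ing all the bits together; a Python sketch of the generator and the checker (the data values are illustrative):

```python
# Parity bit generation and single-error detection (Figs. 2.25-2.26, sketch).
from functools import reduce

def parity(bits):
    """XOR of all bits: 1 iff there is an odd number of 1s."""
    return reduce(lambda u, v: u ^ v, bits, 0)

data = [1, 0, 1, 1, 0, 0, 1, 0]          # d0..d7
d8 = parity(data)                        # transmitted parity bit
received = data + [d8]                   # nine bits, an even number of 1s

assert parity(received) == 0             # no error: checker outputs 0
received[3] ^= 1                         # flip one bit (transmission error)
assert parity(received) == 1             # the single error is detected
```

Note that flipping any even number of bits would go undetected, which is why the scheme assumes at most one corrupted bit.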

Example 2.6 The second example is a parity bit generator. It implements an n-variable switching function parity(a0, a1, ..., an−1) = 1 if, and only if, there is an odd number of 1s among variables a0, a1, ..., an−1. In other words,

parity(a0, a1, ..., an−1) = (a0 + a1 + ... + an−1) mod 2 = a0 ⊕ a1 ⊕ ... ⊕ an−1.

Consider a communication system (Fig. 2.25) that must transmit 8-bit data d = d0d1...d7 from a data source circuit to a data destination circuit. On the source side, an 8-bit parity generator generates an additional bit d8 = d0 ⊕ d1 ⊕ ... ⊕ d7, and the nine bits d0d1...d7d8 are transmitted. Thus, the number of 1s among the transmitted bits d0, d1, ..., d8 is always even. On the destination side, a 9-bit parity generator checks whether the number of 1s among d0, d1, ..., d8 is even or not. If even, the parity generator output is equal to 0; if odd, the output is equal to 1. If it is assumed that during the transmission at most one bit could have been modified, due to noise on the transmission lines, the 9-bit parity generator output is an error signal equal to 0 if no error has occurred and equal to 1 in the contrary case. An 8-bit parity generator and a 9-bit parity generator implemented with XOR2 gates are shown in Fig. 2.26.

Example 2.7 The most common use of XOR gates is within adders. A 1-bit adder implements two switching functions z and d defined by Table 2.3 and by (2.19) and (2.20). According to Table 2.3, z can also be expressed as follows:

z = (x + y + c) mod 2 = x ⊕ y ⊕ c.

(2.29)

On the other hand, d is equal to 1 if, and only if, x + y + c ≥ 2. This condition can be expressed in the following way: either x = y = 1, or c = 1 and x ≠ y. The corresponding Boolean expression is


Fig. 2.28 Tristate buffer and tristate inverter symbols


Table 2.12 Definition of tristate buffer and tristate inverter

  c x   3-state buffer output y   3-state inverter output y
  0 0             Z                         Z
  0 1             Z                         Z
  1 0             0                         1
  1 1             1                         0

Fig. 2.29 Symbols of tristate components with active-low control input

Table 2.13 Definition of tristate components with active-low control input

  c x   3-state buffer output y   3-state inverter output y
  0 0             0                         1
  0 1             1                         0
  1 0             Z                         Z
  1 1             Z                         Z

d = x·y + c·(x ⊕ y).    (2.30)

The circuit that corresponds to (2.29) and (2.30) is shown in Fig. 2.27a. As mentioned above (Fig. 2.20), AND gates and OR gates can be implemented with NAND gates (Fig. 2.27b).
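As a behavioral sketch of (2.29) and (2.30) — nothing here beyond those two equations — the 1-bit adder can be modeled and checked exhaustively:

```python
# 1-bit adder of Fig. 2.27: z is the sum bit (2.29), d the carry bit (2.30).
def full_adder(x, y, c):
    z = x ^ y ^ c                    # z = x XOR y XOR c
    d = (x & y) | (c & (x ^ y))      # d = x.y + c.(x XOR y)
    return z, d

# The pair (d, z), read as a 2-bit number, equals x + y + c.
for x in (0, 1):
    for y in (0, 1):
        for c in (0, 1):
            z, d = full_adder(x, y, c)
            assert 2 * d + z == x + y + c
```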

2.4.3 Tristate Buffers and Tristate Inverters

Tristate buffers and tristate inverters are components whose output can be in three different states: 0 (low voltage), 1 (high voltage), or Z (disconnected). A tristate buffer CMOS implementation is shown in Fig. 1.28a: when the control input c = 0, the output is disconnected from the input, so that the output impedance is very high (infinite if leakage currents are not considered); if c = 1, the output is connected to the input through a CMOS switch. A tristate inverter is equivalent to an inverter whose output is connected to a tristate buffer. It works as follows: when the control input c = 0, the output is disconnected from the input; if c = 1, the output is equal to the inverse of the input. The symbols of a tristate buffer and of a tristate inverter are shown in Fig. 2.28 and their behavior is defined in Table 2.12.
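A minimal behavioral sketch of Table 2.12, with the string 'Z' standing for the high-impedance state:

```python
def tristate_buffer(c, x):
    # c = 1: the output follows the input; c = 0: output disconnected (Z).
    return x if c == 1 else 'Z'

def tristate_inverter(c, x):
    # c = 1: the output is the inverse of the input; c = 0: disconnected.
    return 1 - x if c == 1 else 'Z'

# The four rows of Table 2.12, in order cx = 00, 01, 10, 11.
assert [tristate_buffer(c, x) for c, x in ((0, 0), (0, 1), (1, 0), (1, 1))] \
       == ['Z', 'Z', 0, 1]
assert [tristate_inverter(c, x) for c, x in ((0, 0), (0, 1), (1, 0), (1, 1))] \
       == ['Z', 'Z', 1, 0]
```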



Fig. 2.30 4-Bit bus

Table 2.14 4-Bit bus definition

  cA cB   Data transmission
  0  0    None
  0  1    B → C
  1  0    A → C
  1  1    Not allowed

In some tristate components the control signal c is active at low level. The corresponding symbols and definitions are shown in Fig. 2.29 and Table 2.13. A typical application of tristate components is shown in Fig. 2.30. It is a 4-bit bus that permits 4-bit data to be sent either from circuit A to circuit C or from circuit B to circuit C. As an example, A could be a memory, B an input interface, and C a processor. Both circuits A and B must be able to send data to C but cannot be directly connected to it. To avoid collisions, 3-state buffers are inserted between the outputs of A and B and the set of wires connected to the inputs of circuit C. To transmit data from A to C, cA = 1 and cB = 0, and to transmit data from B to C, cA = 0 and cB = 1 (Table 2.14).
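The bus of Fig. 2.30 can be sketched in the same style. The wire-resolution model and the explicit collision check below are illustrative assumptions, not part of the figure itself:

```python
def tristate4(c, bits):
    # A 4-bit tristate buffer: either all four wires are driven, or none.
    return list(bits) if c == 1 else ['Z'] * len(bits)

def bus_to_C(a_bits, b_bits, cA, cB):
    if cA == 1 and cB == 1:
        raise ValueError('cA = cB = 1 is not allowed (Table 2.14)')
    out = []
    for a, b in zip(tristate4(cA, a_bits), tristate4(cB, b_bits)):
        out.append(a if a != 'Z' else b)   # at most one driver per wire
    return out

assert bus_to_C([1, 0, 1, 1], [0, 1, 0, 0], 1, 0) == [1, 0, 1, 1]   # A -> C
assert bus_to_C([1, 0, 1, 1], [0, 1, 0, 0], 0, 1) == [0, 1, 0, 0]   # B -> C
assert bus_to_C([1, 0, 1, 1], [0, 1, 0, 0], 0, 0) == ['Z'] * 4      # no driver
```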

2.5 Synthesis Tools

In order to efficiently implement combinational circuits, synthesis tools are necessary. In this section, some of the principles used to optimize combinational circuits are described.

2.5.1 Redundant Terms

When defining a switching function it might be that for some combinations of input variable values the corresponding output value is not defined, either because those input value combinations never happen or because the function value does not matter. In the truth table, the corresponding entries are named "don't care" (instead of 0 or 1). When defining a Boolean expression that describes the switching function to be implemented, the minterms that correspond to those don't care entries can be used, or not, in order to optimize the final circuit.


Fig. 2.31 BCD to 7-segment decoder


Table 2.15 BCD to 7-segment decoder definition

  Digit   x3x2x1x0   A B C D E F G
    0       0000     1 1 1 1 1 1 0
    1       0001     0 1 1 0 0 0 0
    2       0010     1 1 0 1 1 0 1
    3       0011     1 1 1 1 0 0 1
    4       0100     0 1 1 0 0 1 1
    5       0101     1 0 1 1 0 1 1
    6       0110     1 0 1 1 1 1 1
    7       0111     1 1 1 0 0 0 0
    8       1000     1 1 1 1 1 1 1
    9       1001     1 1 1 0 0 1 1
    –       1010     – – – – – – –
    –       1011     – – – – – – –
    –       1100     – – – – – – –
    –       1101     – – – – – – –
    –       1110     – – – – – – –
    –       1111     – – – – – – –

Example 2.8 A BCD to 7-segment decoder (Fig. 2.31) is a combinational circuit with four inputs x3, x2, x1, and x0 that are the binary representation of a decimal digit (BCD means binary coded decimal) and seven outputs that control the seven segments of a display. Among the 16 combinations of x3, x2, x1, and x0 values, only 10 are used: those that correspond to digits 0-9. Thus, the values of outputs A to G that correspond to inputs 1010 to 1111 are unspecified (don't care). The BCD to 7-segment decoder is defined by Table 2.15. If all don't care entries are substituted by 0s, the following set of Boolean expressions is obtained (x' denotes the complement of x):

A = x3'·x1 + x3'·x2·x0 + x3·x2'·x1',    (2.31a)

B = x3'·x2' + x2'·x1' + x3'·x1'·x0' + x3'·x1·x0,    (2.31b)

C = x2'·x1' + x3'·x0 + x3'·x2,    (2.31c)

D = x2'·x1'·x0' + x3'·x2'·x1 + x3'·x1·x0' + x3'·x2·x1'·x0,    (2.31d)

E = x2'·x1'·x0' + x3'·x1·x0',    (2.31e)

F = x3'·x1'·x0' + x3'·x2·x1' + x3'·x2·x0' + x3·x2'·x1',    (2.31f)

G = x3'·x2'·x1 + x3'·x2·x1' + x3'·x2·x0' + x3·x2'·x1'.    (2.31g)

Table 2.16 Another definition of function B

  x3x2x1x0   B
    0000     1
    0001     1
    0010     1
    0011     1
    0100     1
    0101     0
    0110     0
    0111     1
    1000     1
    1001     1
    1010     1
    1011     1
    1100     1
    1101     0
    1110     0
    1111     1

For example, B can be expressed as the sum of minterms m0, m1, m2, m3, m4, m7, m8, and m9:

B = x3'·x2'·x1'·x0' + x3'·x2'·x1'·x0 + x3'·x2'·x1·x0' + x3'·x2'·x1·x0 + x3'·x2·x1'·x0' + x3'·x2·x1·x0 + x3·x2'·x1'·x0' + x3·x2'·x1'·x0.

Then, the previous expression can be minimized:

B = x3'·x2'·(x1'·x0' + x1'·x0 + x1·x0' + x1·x0) + x3'·x2·x1'·x0' + x3'·x2·x1·x0 + x3·x2'·x1'·(x0' + x0)
  = x3'·x2' + x3'·x2·x1'·x0' + x3'·x2·x1·x0 + x3·x2'·x1'
  = x3'·x2' + x3'·x2·x1'·x0' + x3'·x2·x1·x0 + x3·x2'·x1' + x3'·x2'·x1'·x0' + x3'·x2'·x1·x0 + x3'·x2'·x1'
  = x3'·x2' + x3'·(x2 + x2')·x1'·x0' + x3'·(x2 + x2')·x1·x0 + (x3 + x3')·x2'·x1'
  = x3'·x2' + x3'·x1'·x0' + x3'·x1·x0 + x2'·x1'.    (2.32)

By performing the same type of optimization for all other functions, the set of (2.31) has been obtained. In Table 2.16 the don't care entries of function B have been defined in another way and a different Boolean expression is obtained: according to Table 2.16, B = 1 if, and only if, x2 = 0 or x1x0 = 00 or 11; thus

B = x2' + x1'·x0' + x1·x0.    (2.33)

Equations (2.32) and (2.33) are compatible with the initial specification (Table 2.15). They generate different values of B when x3x2x1x0 = 1010, 1011, 1100, or 1111, but in those cases the value of B does not matter. On the other hand, (2.33) is simpler than (2.32) and would correspond to a better implementation.

Table 2.17 Comparison between (2.31) and (2.34)

  Gate type   Number of gates (2.31)   Number of gates (2.34)
  AND2                 6                        14
  AND3                17                         1
  AND4                 1                         –
  OR2                  1                         1
  OR3                  2                         2
  OR4                  4                         4
  NOT                  4                         4

By performing the same type of optimization for all other functions, the following set of expressions has been obtained:

A = x1 + x2·x0 + x2'·x0' + x3,    (2.34a)

B = x2' + x1'·x0' + x1·x0,    (2.34b)

C = x1' + x0 + x2,    (2.34c)

D = x2'·x0' + x2'·x1 + x1·x0' + x2·x1'·x0,    (2.34d)

E = x2'·x0' + x1·x0',    (2.34e)

F = x1'·x0' + x2·x1' + x2·x0' + x3,    (2.34f)

G = x2'·x1 + x2·x1' + x1·x0' + x3.    (2.34g)

To summarize:

• If the don't care entries of Table 2.15 are replaced by 0s, the set of (2.31) is obtained.
• If they are replaced by either 0 or 1, according to some optimization method (not described in this course), the set of (2.34) is obtained.

In Table 2.17 the numbers of AND, OR, and NOT gates necessary to implement (2.31) and (2.34) are shown. The circuit that implements (2.31) has 6·2 + 17·3 + 1·4 + 1·2 + 2·3 + 4·4 + 4·1 = 95 gate inputs and the circuit that implements (2.34) has 14·2 + 1·3 + 1·2 + 2·3 + 4·4 + 4·1 = 59 gate inputs. Obviously, the second circuit is better.
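As a cross-check (a sketch; x2' here denotes the complement of x2), the expression B = x2' + x1'·x0' + x1·x0 derived from Table 2.16 can be compared against the specified rows of Table 2.15 — the rows 1010 to 1111 are don't care and need no check:

```python
# Column B of Table 2.15 for digits 0 to 9.
B_spec = [1, 1, 1, 1, 1, 0, 0, 1, 1, 1]

def B(x3, x2, x1, x0):
    # B = x2' + x1'.x0' + x1.x0
    return (1 - x2) | ((1 - x1) & (1 - x0)) | (x1 & x0)

for digit in range(10):
    bits = [(digit >> k) & 1 for k in (3, 2, 1, 0)]   # x3, x2, x1, x0
    assert B(*bits) == B_spec[digit]
```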

2.5.2 Cube Representation

A combinational circuit synthesis tool is a set of programs that generates optimized circuits, according to some criteria (cost, delay, power), starting either from logic expressions or from tables. The cube representation of combinational functions is an easy way to define Boolean expressions within a computer programming environment. The set B2^n of n-component binary vectors can be considered as a cube (actually a hypercube) of dimension n. For example, if n = 3, the set B2^3 of 3-component binary vectors is represented by the cube of Fig. 2.32a.


Fig. 2.32 Cubes


Fig. 2.33 Solutions of x2·x0' = 1


A subset of B2^n defined by giving a particular value to m vector components is a subcube B2^(n−m) of dimension n − m. As an example, the subset of vectors of B2^3 whose first coordinate is equal to 0 (Fig. 2.32b) is a cube of dimension 2 (actually a square). Another example: the subset of vectors of B2^3 whose first coordinate is 1 and whose third coordinate is 0 (Fig. 2.32c) is a cube of dimension 1 (actually a straight line).

Consider a 4-variable function f defined by the following Boolean expression:

f(x3, x2, x1, x0) = x2·x0'.

This function is equal to 1 if, and only if, x2 = 1 and x0 = 0, that is, f = 1 iff (x3, x2, x1, x0) ∈ {x ∈ B2^4 | x2 = 1 and x0 = 0}. In other words, f = 1 if, and only if, (x3, x2, x1, x0) belongs to the 2-dimensional cube of Fig. 2.33. This example suggests another definition.

Definition 2.2 A cube is a set of elements of B2^n where a product of literals (Definition 2.1) is equal to 1.

In this chapter switching functions have been expressed under the form of sums of products of literals (e.g., (2.19) and (2.20)), and to those expressions correspond implementations by means of logic gates (e.g., Figs. 2.17 and 2.16). According to Definition 2.2, a set of elements of B2^n where a product of literals is equal to 1 is a cube. Thus, a sum of products of literals can also be seen as a union of cubes that defines the set of points of B2^n where f = 1. In what follows, cube and product of literals are considered as synonymous.

How can a product of literals be represented within a computer programming environment? For that, an order of the variables must be defined, for example (as above) xn−1, xn−2, ..., x1, x0. Then a product p of literals is represented by an n-component ternary vector (pn−1, pn−2, ..., p1, p0) where

• pi = 0 if xi is in p under inverted form (xi').
• pi = 1 if xi is in p under non-inverted form (xi).
• pi = X if xi is not in p.
Example 2.9 (with n = 4) The set of cubes that describes (2.31d) is {X000, 001X, 0X10, 0101}, and the set of cubes that corresponds to (2.34g) is {X01X, X10X, XX10, 1XXX}. Conversely, the product of literals represented by 1X01 is x3·x1'·x0 and the product of literals represented by X1X0 is x2·x0'.
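A sketch of this ternary-vector representation, with cubes written as strings over {'0', '1', 'X'} in the order x3x2x1x0:

```python
from itertools import product

def cube_matches(cube, bits):
    # 'X' matches both values; '0'/'1' must equal the corresponding bit.
    return all(c == 'X' or c == b for c, b in zip(cube, bits))

def expand(cubes, n=4):
    # Set of points of B2^n covered by a union of cubes.
    return {int(''.join(bits), 2)
            for bits in product('01', repeat=n)
            if any(cube_matches(cube, bits) for cube in cubes)}

# The cubes given for (2.31d) cover exactly the rows of Table 2.15 where
# D = 1 (digits 0, 2, 3, 5, 6 and 8), the don't cares being set to 0.
assert expand({'X000', '001X', '0X10', '0101'}) == {0, 2, 3, 5, 6, 8}
```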


Fig. 2.34 Union of cubes

2.5.3 Adjacency

Adjacency is the basic concept that permits the optimization of Boolean expressions. Two m-dimensional cubes are adjacent if their associated ternary vectors differ in only one position. As an example (n = 3), the 1-dimensional cubes X11 (Fig. 2.34a) and X01 (Fig. 2.34b) are adjacent and their union is a 2-dimensional cube XX1 = X11 ∪ X01 (Fig. 2.34c). The corresponding products of literals are the following: X11 represents x1·x0, X01 represents x1'·x0, and their union XX1 represents x0. In terms of products of literals, the union of the two adjacent cubes is the sum of the corresponding products:

x1·x0 + x1'·x0 = (x1 + x1')·x0 = 1·x0 = x0.

Thus, if a function f is defined by a union of cubes and if two cubes are adjacent, then they can be replaced by their union. The result, in terms of products of literals, is that two products of n − m literals are replaced by a single product of n − m − 1 literals.

Example 2.10 A function f of four variables a, b, c, and d is defined by its minterms (Definition 2.1):

f(a, b, c, d) = a'·b'·c·d' + a'·b'·c·d + a'·b·c'·d + a'·b·c·d' + a'·b·c·d + a·b'·c'·d'.

The corresponding set of cubes is

{0010, 0011, 0101, 0110, 0111, 1000}.

The following adjacencies permit the simplification of the representation of f:

0010 ∪ 0011 = 001X, 0110 ∪ 0111 = 011X, 0101 ∪ 0111 = 01X1.

Thanks to the idempotence property (2.9) the same cube (0111 in this example) can be used several times. The simplified set of cubes is

{001X, 011X, 01X1, 1000}.

There remains an adjacency: 001X ∪ 011X = 0X1X. The final result is


{0X1X, 01X1, 1000} and the corresponding Boolean expression is

f = a'·c + a'·b·d + a·b'·c'·d'.

To conclude, the repeated use of the fact that two adjacent cubes can be replaced by a single cube permits the generation of new Boolean expressions, equivalent to the initial one and with fewer terms. Furthermore the new terms have fewer literals. This is the basis of most automatic optimization tools. All commercial synthesis tools include programs that automatically generate optimal circuits according to some criteria such as cost, delay, or power consumption, and starting from several types of specification. For educational purposes open-source tools are available, for example C. Burch (2005).
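The repeated merging of adjacent cubes can be sketched as follows. This is a simplified greedy pass, not a full Quine–McCluskey procedure, but it is enough to reproduce Example 2.10:

```python
def merge(a, b):
    # Union of two adjacent cubes: they differ in exactly one position,
    # and neither cube holds an 'X' at that position.
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    if len(diff) == 1 and 'X' not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + 'X' + a[i + 1:]
    return None

def simplify(cubes):
    cubes = set(cubes)
    while True:
        merged, used = set(), set()
        for a in cubes:
            for b in cubes:
                m = merge(a, b)
                if m is not None:          # idempotence: reuse is allowed
                    merged.add(m)
                    used.update((a, b))
        if not merged:
            return cubes
        cubes = merged | (cubes - used)

assert simplify({'0010', '0011', '0101', '0110', '0111', '1000'}) \
       == {'0X1X', '01X1', '1000'}
```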

2.5.4 Karnaugh Map

In the case of switching functions of a few variables, a graphical method can be used to detect adjacencies and to optimize Boolean expressions. Consider the function f(a, b, c, d) of Example 2.10. It can be represented by the Karnaugh map (Karnaugh 1953) of Fig. 2.35a. Observe the enumeration ordering of rows and columns (00, 01, 11, 10): the variable values that correspond to a row (a column) and to the next row (the next column) differ in only one position.

Fig. 2.35 Karnaugh maps


Fig. 2.36 Optimization of f


Fig. 2.37 Functions g and h


To each cell of this graphical representation is associated a minterm of the function (a 0-dimensional cube). Several examples are shown in Fig. 2.35b. Thanks to the chosen enumeration ordering, to groups of two adjacent 1s like those of Fig. 2.35c are associated 1-dimensional cubes. To a group of four adjacent 1s like the one of Fig. 2.35d is associated a 2-dimensional cube. To groups of eight adjacent 1s like those of Fig. 2.35e (another switching function) are associated 3-dimensional cubes. Thus (Fig. 2.36) the function f(a, b, c, d) of Example 2.10 can be expressed as the Boolean sum of three cubes 0X1X, 01X1, and 1000, so that

f = a'·c + a'·b·d + a·b'·c'·d'.

It is important to observe that the rightmost cells and the leftmost cells are adjacent, and so are the uppermost cells and the lowermost cells (as if the map were drawn on the surface of a torus). Two additional examples are given in Fig. 2.37. Function g of Fig. 2.37a can be expressed as the Boolean sum of two 1-dimensional cubes 0X0 and 1X1, so that

g = x2'·x0' + x2·x0,

and function h of Fig. 2.37b can be expressed as the Boolean sum of six 1-dimensional cubes 01X1, 011X, 0X11, 10X0, 100X, and 1X00, so that

h = a'·b·d + a'·b·c + a'·c·d + a·b'·d' + a·b'·c' + a·c'·d'.


Fig. 2.38 Propagation time tp


Fig. 2.39 Example of propagation time computation

2.6 Propagation Time

Logic components such as gates are physical systems. Any change of their state, for example the output voltage transition from some level to another level, needs some quantity of energy and therefore some time (zero delay would mean infinite power). Thus, apart from their function (AND2, OR3, NAND4, and so on), logic gates are also characterized by their propagation time (delay) between inputs and outputs.

Consider a simple NOR2 gate (Fig. 2.38a). Assume that initially a = b = 0. Then z = NOR(0, 0) = 1 (Fig. 2.38b). When b rises from 0 to 1, then NOR(0, 1) = 0 and z must fall from 1 to 0. However the output state change is not immediate; there is a small delay tp generally expressed in nanoseconds (ns) or picoseconds (ps).

Example 2.11 The circuit of Fig. 2.39a implements a 5-variable switching function z = a·b + c'·d + e. Assume that all components (AND2, NOT, OR3) have the same propagation time τ ns. Initially a = 0, b is either 0 or 1, c = 1, d = 1, and e = 0. Thus z = 0·b + 0·1 + 0 = 0. If c falls from 1 down to 0 then the new value of z must be z = 0·b + 1·1 + 0 = 1. However this output state change takes some time: the inverter output c' changes after τ ns; the AND2 output c'·d changes after 2τ ns, and the OR3 output z changes after 3τ ns (Fig. 2.39b). Thus, the propagation time of a circuit depends on the component propagation times but also on the circuit structure itself. Two different circuits could implement the same switching function but with different propagation times.
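The delay computation of Example 2.11 can be sketched by levelizing the circuit, taking τ as the time unit: the output of a gate settles one τ after its latest input settles.

```python
def gate(*input_ready_times, tau=1):
    # A gate output is valid tau after its latest input becomes valid.
    return tau + max(input_ready_times)

t0 = 0                          # a, b, c, d, e all change/settle at t = 0
t_not_c = gate(t0)              # inverter output c':       1 tau
t_ab = gate(t0, t0)             # AND2 output a.b:          1 tau
t_cd = gate(t_not_c, t0)        # AND2 output c'.d:         2 tau
t_z = gate(t_ab, t_cd, t0)      # OR3 output z:             3 tau
assert t_z == 3
```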



Fig. 2.40 Two circuits that implement the same function f

Fig. 2.41 n-Bit comparator


Example 2.12 The two following expressions define the same switching function z:

z = (a + b)·(c + d)·e + (k + g)·(h + i)·j,
z = a·c·e + a·d·e + b·c·e + b·d·e + k·h·j + k·i·j + g·h·j + g·i·j.

The corresponding circuits are shown in Fig. 2.40a, b. The circuit of Fig. 2.40a has 7 gates and 16 gate inputs while the circuit of Fig. 2.40b has 9 gates and 32 gate inputs. On the other hand, if all gates are assumed to have the same propagation time τ ns, then the circuit of Fig. 2.40a has a propagation time equal to 3τ ns while the circuit of Fig. 2.40b has a propagation time equal to 2τ ns. Thus, the circuit of Fig. 2.40a could be less expensive in terms of number of transistors but with a longer propagation time than the circuit of Fig. 2.40b. Depending on the system specification, the designer will have to choose between a faster but more expensive implementation and a slower but cheaper one (speed vs. cost balance).

A more realistic example is now presented. An n-bit comparator (Fig. 2.41) is a circuit with two n-bit inputs X = xn−1xn−2 ... x0 and Y = yn−1yn−2 ... y0 that represent two naturals and three 1-bit outputs G (greater), L (lower), and E (equal). It works as follows: G = 1 if X > Y, otherwise G = 0; L = 1 if X < Y, otherwise L = 0; E = 1 if X = Y, otherwise E = 0.

A step-by-step algorithm can be used. For that, the pairs of bits (xi, yi) are sequentially explored starting from the most significant bits (xn−1, yn−1). Initially G = 0, L = 0, and E = 1. As long as xi = yi, the values of G, L, and E do not change. When for the first time xi ≠ yi, there are two possibilities: if xi > yi then G = 1, L = 0, and E = 0, and if xi < yi then G = 0, L = 1, and E = 0. From this step on, the values of G, L, and E do not change any more.


Table 2.18 Magnitude comparison

  X   1   0   1   1   0 or 1   0 or 1   0 or 1   0 or 1
  Y   1   0   1   0   0 or 1   0 or 1   0 or 1   0 or 1
  G   0   0   0   1     1        1        1        1
  L   0   0   0   0     0        0        0        0
  E   1   1   1   0     0        0        0        0


Fig. 2.42 Comparator structure

Algorithm 2.2 Magnitude Comparison

G = 0; L = 0; E = 1;
for i in n-1 downto 0 loop
  if E = 1 and xi > yi then G = 1; L = 0; E = 0;
  elsif E = 1 and xi < yi then G = 0; L = 1; E = 0;
  end if;
end loop;

This method is correct because in binary the weight of bits xi and yi is 2^i, which is greater than 2^(i−1) + 2^(i−2) + ... + 2^0 = 2^i − 1. An example of computation is given in Table 2.18 with n = 8, X = 1011---- and Y = 1010----. The corresponding circuit structure is shown in Fig. 2.42. Obviously E = 1 if G = 0 and L = 0, so that E = NOR(G, L). Every block (Fig. 2.43) executes the loop body of Algorithm 2.2 and is defined by the following Boolean expressions, where Ei = Gi'·Li':

Gi−1 = Ei·xi·yi' + Ei'·Gi = Gi'·Li'·xi·yi' + (Gi + Li)·Gi = Li'·xi·yi' + Gi,    (2.35)

Li−1 = Ei·xi'·yi + Ei'·Li = Gi'·Li'·xi'·yi + (Gi + Li)·Li = Gi'·xi'·yi + Li.    (2.36)

The circuit that implements (2.35) and (2.36) is shown in Fig. 2.44. It contains 8 gates (including the inverters) and 14 gate inputs, and its propagation time is 3τ ns, assuming as before that all components (NOT, AND3, and OR2) have the same delay τ ns. The complete n-bit comparator (Fig. 2.42) contains 8n + 1 gates and 14n + 2 gate inputs and has a propagation time equal to (3n + 1)τ ns.

Instead of reading the bits of X and Y one at a time, consider an algorithm that reads two bits of X and Y at each step. Assume that n = 2m. Then the following Algorithm 2.3 is similar to Algorithm 2.2. The difference is that two successive bits x2j+1 and x2j of X and two successive bits y2j+1 and y2j of Y are considered at each step. Those pairs of bits can be interpreted as quaternary digits (base-4 digits).
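Equations (2.35) and (2.36) can be checked by iterating the 1-bit block from the most significant pair of bits down — a sketch, with the complement of x rendered as 1 - x:

```python
def compare_bits(xs, ys):
    # xs, ys: bit lists, most significant bit first.
    G, L = 0, 0
    for x, y in zip(xs, ys):
        # Both updates use the incoming (old) G and L, as in Fig. 2.43:
        # G_{i-1} = L'.x.y' + G (2.35), L_{i-1} = G'.x'.y + L (2.36).
        G, L = ((1 - L) & x & (1 - y)) | G, ((1 - G) & (1 - x) & y) | L
    return G, L, (1 - G) & (1 - L)          # E = NOR(G, L)

# Exhaustive check for n = 4 against integer comparison.
for X in range(16):
    for Y in range(16):
        xs = [(X >> k) & 1 for k in (3, 2, 1, 0)]
        ys = [(Y >> k) & 1 for k in (3, 2, 1, 0)]
        assert compare_bits(xs, ys) == (int(X > Y), int(X < Y), int(X == Y))
```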


Fig. 2.43 1-Bit comparator


Fig. 2.44 1-Bit comparator implementation


Fig. 2.45 Comparator structure (version 2)

Fig. 2.46 2-Bit comparator


Algorithm 2.3 Magnitude Comparison, Version 2

G = 0; L = 0; E = 1;
for j in m-1 downto 0 loop
  if E = 1 and x2j+1x2j > y2j+1y2j then G = 1; L = 0; E = 0;
  elsif E = 1 and x2j+1x2j < y2j+1y2j then G = 0; L = 1; E = 0;
  end if;
end loop;

The corresponding circuit structure is shown in Fig. 2.45. Every block (Fig. 2.46) executes the loop body of Algorithm 2.3 and is defined by Table 2.19, to which correspond the following equations:

Gj−1 = Lj'·xi−1·yi'·yi−1' + Lj'·xi·yi' + Lj'·xi·xi−1·yi−1' + Gj,    (2.37)

Lj−1 = Gj'·yi−1·xi'·xi−1' + Gj'·yi·xi' + Gj'·yi·yi−1·xi−1' + Lj,    (2.38)

where i = 2j + 1.


Table 2.19 2-Bit comparator definition

  Gj  Lj  xi  xi−1  yi  yi−1  Gj−1  Lj−1
  0   0   0    0    0    0     0     0
  0   0   0    0    1    –     0     1
  0   0   0    0    –    1     0     1
  0   0   0    1    0    0     1     0
  0   0   0    1    0    1     0     0
  0   0   0    1    1    –     0     1
  0   0   1    0    0    –     1     0
  0   0   1    0    1    0     0     0
  0   0   1    0    1    1     0     1
  0   0   1    1    0    –     1     0
  0   0   1    1    –    0     1     0
  0   0   1    1    1    1     0     0
  0   1   –    –    –    –     0     1
  1   0   –    –    –    –     1     0
  1   1   –    –    –    –     –     –

Fig. 2.47 2-Bit comparator implementation

Table 2.20 Comparison between the circuits of Figs. 2.42 and 2.45

  Circuit       Gates    Gate inputs   Propagation time
  Figure 2.42   8n + 1   14n + 2       (3n + 1)τ
  Figure 2.45   7n + 1   18n + 2       (1.5n + 1)τ

The circuit that implements (2.37) and (2.38) is shown in Fig. 2.47. It contains 14 gates (including the inverters) and 36 gate inputs, and its propagation time is 3τ ns, assuming as before that all components (NOT, AND3, AND4, and OR4) have the same delay τ ns. The complete n-bit comparator (Fig. 2.45), with n = 2m, contains 14m + 1 = 7n + 1 gates and 36m + 2 = 18n + 2 gate inputs and has a propagation time equal to (3m + 1)τ = (1.5n + 1)τ ns. To summarize (Table 2.20), the circuit of Fig. 2.45 has fewer gates, more gate inputs, and a shorter propagation time than the circuit of Fig. 2.42 (roughly half the propagation time).
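Equations (2.37) and (2.38) can be checked in the same way — one 2-bit block per quaternary digit, iterated from the most significant side (a sketch; xi1 stands for x_{i−1}):

```python
def block(G, L, xi, xi1, yi, yi1):
    nG, nL = 1 - G, 1 - L                  # complements of the incoming G, L
    G2 = (nL & xi1 & (1 - yi) & (1 - yi1)) | (nL & xi & (1 - yi)) \
         | (nL & xi & xi1 & (1 - yi1)) | G                          # (2.37)
    L2 = (nG & yi1 & (1 - xi) & (1 - xi1)) | (nG & yi & (1 - xi)) \
         | (nG & yi & yi1 & (1 - xi1)) | L                          # (2.38)
    return G2, L2

# Exhaustive check for n = 4 (m = 2 blocks) against integer comparison.
for X in range(16):
    for Y in range(16):
        G, L = 0, 0
        for s in (2, 0):   # two quaternary digits, most significant first
            G, L = block(G, L, (X >> (s + 1)) & 1, (X >> s) & 1,
                         (Y >> (s + 1)) & 1, (Y >> s) & 1)
        assert (G, L) == (int(X > Y), int(X < Y))
```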

2.7 Other Logic Blocks

Apart from the logic gates, some other components are available and can be used to implement combinational circuits.

2.7.1 Multiplexers

The circuit of Fig. 2.48a is a 1-bit 2-to-1 multiplexer (MUX2-1). It has two data inputs x0 and x1, a control input c, and a data output y. It works as follows (Fig. 2.48b): when c = 0 the data output y is connected to the data input x0 and when c = 1 the data output y is connected to the data input x1. So, the main function of a multiplexer is to implement controllable connections. A typical application is shown in Fig. 2.49: the input of circuit_C can be connected to the output of either circuit_A or circuit_B under the control of signal control:

• If control = 0, circuit_C input = circuit_A output.
• If control = 1, circuit_C input = circuit_B output.

More complex multiplexers can be defined. An m-bit 2^n-to-1 multiplexer has 2^n m-bit data inputs x0, x1, ..., x2^n−1, an n-bit control input c, and an m-bit data output y. It works as follows: if c is equal to the binary representation of natural i, then y = xi. Two examples are given in Fig. 2.50: in Fig. 2.50a the symbol of an m-bit 2-to-1 multiplexer is shown, and in Fig. 2.50b the symbol and the truth table of a 1-bit 4-to-1 multiplexer (MUX4-1) are shown.

Fig. 2.48 1-Bit 2-to-1 multiplexer


Fig. 2.49 Example of controllable connection


Fig. 2.50 Examples of multiplexers



Fig. 2.51 2-Bit MUX4-1 implemented with six 1-bit MUX2-1

Fig. 2.52 MUX2-1 is a universal module


Fig. 2.53 Implementation of two 3-variable switching functions

In fact, any multiplexer can be built with 1-bit 2-to-1 multiplexers. For example, Fig. 2.51a is the symbol of a 2-bit MUX4-1 and Fig. 2.51b is an implementation consisting of six 1-bit MUX2-1. Multiplexers can also be used to implement switching functions. The function executed by the 1-bit MUX2-1 of Fig. 2.48 is

y = c'·x0 + c·x1.    (2.39)

In particular, MUX2-1 is a universal module (Fig. 2.52):

• If c = a, x0 = 0 and x1 = b, then y = a·b.
• If c = a, x0 = b and x1 = 1, then y = a'·b + a = a + b.
• If c = a, x0 = 1 and x1 = 0, then y = a'.

Furthermore, any switching function of n variables can be implemented by a 2^n-to-1 multiplexer. As an example, consider the 3-variable switching functions y1 and y0 of Fig. 2.53a. Each of them can be implemented by a MUX8-1 that in turn can be synthesized with seven MUX2-1 (Fig. 2.53b). The


Fig. 2.54 Optimization rules


Fig. 2.55 Optimized circuits


three variables x2, x1, and x0 are used to control the connection of the output (y1 or y0) to a constant value as defined in the function truth table. In many cases the circuit can be simplified using simple and obvious rules. Two optimization rules are shown in Fig. 2.54. In Fig. 2.54a, if x = 0 then the multiplexer output is equal to 0 and if x = 1 then the multiplexer output is equal to 1; thus the multiplexer output is equal to x. In Fig. 2.54b, two multiplexers controlled by the same variable x and with the same data inputs can be replaced by a unique multiplexer. An optimized version of the circuits of Fig. 2.53b is shown in Fig. 2.55.

Two switching function synthesis methods using multiplexers have been described. The first is to use multiplexers to implement the basic Boolean operations (AND, OR, NOT), which is generally not a good idea, rather a way to demonstrate that MUX2-1 is a universal module. The second is the use of an m-bit 2^n-to-1 multiplexer to implement m functions of n variables. In fact, an m-bit 2^n-to-1 multiplexer with all its data inputs connected to constant values implements the same function as a ROM storing 2^n m-bit words. Then the 2^n-to-1 multiplexers can be synthesized with MUX2-1 and the circuits can be optimized using rules such as those of Fig. 2.54.

A more general switching function synthesis method with MUX2-1 components is based on (2.39) and on the fact that any n-variable switching function f(x0, x1, ..., xn−1) can be expressed under the form

f(x0, x1, ..., xn−1) = x0'·f0(x1, ..., xn−1) + x0·f1(x1, ..., xn−1)

(2.40)

where

f0(x1, ..., xn−1) = f(0, x1, ..., xn−1) and f1(x1, ..., xn−1) = f(1, x1, ..., xn−1)    (2.41)

are functions of n − 1 variables. The circuit of Fig. 2.56 is a direct consequence of (2.40) and (2.41). In this way variable x0 has been extracted.
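The MUX2-1 function y = c'·x0 + c·x1 and the three universal-module settings of Fig. 2.52 can be checked exhaustively (a sketch, with the complement of c rendered as 1 - c):

```python
def mux21(c, x0, x1):
    return ((1 - c) & x0) | (c & x1)    # y = c'.x0 + c.x1, (2.39)

for a in (0, 1):
    assert mux21(a, 1, 0) == 1 - a      # NOT: x0 = 1, x1 = 0
    for b in (0, 1):
        assert mux21(a, 0, b) == a & b  # AND: x0 = 0, x1 = b
        assert mux21(a, b, 1) == a | b  # OR:  x0 = b, x1 = 1
```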


Fig. 2.56 Extraction of variable x0


Fig. 2.57 MUX2-1 implementation of f


Then a similar variable extraction can be performed with functions f0 and f1 (not necessarily with the same variable) so that functions of n − 2 variables are obtained, and so on. Thus, an iterative extraction of variables finally generates constants (0-variable functions), variables, or already generated functions.

Example 2.13 Use the variable extraction method to implement the following 4-variable function:

f = x0'·x1'·x3 + x0'·x1·x2' + x0·x2' + x0·x3.

First extract x0:

f0 = x1'·x3 + x1·x2' and f1 = x2' + x3.

Then extract x1 from f0:

f00 = x3 and f01 = x2'.

Extract x2 from f1:

f10 = 1 and f11 = x3.

It remains to synthesize x3 = x3·1 + x3'·0. The circuit is shown in Fig. 2.57.
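The extraction tree of Example 2.13 (Fig. 2.57) can be sketched as nested MUX2-1 components and checked against the original expression for all 16 input combinations:

```python
from itertools import product

def mux21(c, x0, x1):
    return ((1 - c) & x0) | (c & x1)    # y = c'.x0 + c.x1

def f_tree(x0, x1, x2, x3):
    f0 = mux21(x1, x3, 1 - x2)          # f00 = x3, f01 = x2'
    f1 = mux21(x2, 1, x3)               # f10 = 1,  f11 = x3
    return mux21(x0, f0, f1)            # extract x0 last

def f_ref(x0, x1, x2, x3):
    # f = x0'.x1'.x3 + x0'.x1.x2' + x0.x2' + x0.x3
    return ((1 - x0) & (1 - x1) & x3) | ((1 - x0) & x1 & (1 - x2)) \
           | (x0 & (1 - x2)) | (x0 & x3)

for bits in product((0, 1), repeat=4):
    assert f_tree(*bits) == f_ref(*bits)
```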

2.7.2 Multiplexers and Memory Blocks

ROM blocks can be used to implement switching functions defined by their truth table (Sect. 2.2) but in most cases it is a very inefficient method. However the combined use of small ROM blocks, generally called LUTs (lookup tables), and of multiplexers permits the definition of efficient circuits. This is a commonly used technique in field programmable devices such as FPGAs (Chap. 7). Assume that 6-input LUTs (LUT6) are available. Then the variable extraction method of Fig. 2.56 can be applied iteratively up to the step where all obtained functions depend on at most six variables. As an example, the circuit of Fig. 2.58 implements any function of eight variables. Observe that the rightmost part of the circuit of Fig. 2.58 synthesizes a function of six variables: x6, x7 and the four LUT6 outputs. An alternative circuit consisting of five LUT6 is shown in Fig. 2.59. Figure 2.59 suggests a variable extraction method in which two variables are extracted at each step. It uses the following relation:


Fig. 2.58 Implementation of an 8-variable switching function


Fig. 2.59 Alternative circuit


Fig. 2.60 Extraction of variables x0 and x1


f(x0, x1, ..., xn−1) = x0'·x1'·f00(x2, ..., xn−1) + x0'·x1·f01(x2, ..., xn−1) + x0·x1'·f10(x2, ..., xn−1) + x0·x1·f11(x2, ..., xn−1),

(2.42)

where

f00(x2, ..., xn−1) = f(0, 0, x2, ..., xn−1), f01(x2, ..., xn−1) = f(0, 1, x2, ..., xn−1), f10(x2, ..., xn−1) = f(1, 0, x2, ..., xn−1), f11(x2, ..., xn−1) = f(1, 1, x2, ..., xn−1)

are functions of n − 2 variables. The corresponding variable extraction circuit (Fig. 2.60) is a LUT6 that implements a function of the six variables x0, x1, f00, f01, f10, and f11 equal to

x0'·x1'·f00 + x0'·x1·f01 + x0·x1'·f10 + x0·x1·f11.

Then a similar variable extraction can be performed with functions f00, f01, f10, and f11 so that functions of n − 4 variables are obtained, and so on.

2 Combinational Circuits

Fig. 2.61 AND plane and OR plane

Fig. 2.62 Switching function implementation with two planes

Fig. 2.63 Address decoders

2.7.3 Planes

Sometimes AND planes and OR planes are used to implement switching functions. An (n, p) AND plane (Fig. 2.61a) implements p functions yj of n variables, where each yj is a product of literals (a variable or the inverse of a variable): yj = wj,0·wj,1·...·wj,n-1 where wj,i ∈ {1, xi, xi'}. A (p, s) OR plane (Fig. 2.61b) implements s functions zj of p variables, where each zj is a Boolean sum of variables: zj = wj,0 + wj,1 + ... + wj,p-1 where wj,i ∈ {0, yi}. Those planes can be configured when the corresponding integrated circuit (IC) is manufactured, or can be programmed by the user, in which case they are called field programmable devices. Any set of s switching functions that can be expressed as Boolean sums of at most p products of at most n literals can be implemented by a circuit made up of an (n, p) AND plane and a (p, s) OR plane (Fig. 2.62): the AND plane generates p products of at most n literals and the OR plane generates s sums of at most p terms. Depending on the manufacturing technology and on the manufacturer, those AND-OR plane circuits receive different names such as Programmable Array Logic (PAL), Programmable Logic Array (PLA), Programmable Logic Device (PLD), and others.
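The two planes can be modeled with a short Python sketch (the encoding of literals and the function names are illustrative assumptions, not the book's notation): the AND plane evaluates product terms, the OR plane sums selected terms.

```python
# Sketch of an AND-OR plane (PLA). In each AND-plane row an entry is
# None (literal absent, i.e., w = 1), True (xi), or False (xi inverted).

def and_plane(rows, x):
    """Evaluate p product terms over the input bits x."""
    y = []
    for row in rows:
        term = 1
        for w, xi in zip(row, x):
            if w is True:
                term &= xi
            elif w is False:
                term &= 1 - xi
        y.append(term)
    return y

def or_plane(rows, y):
    """Each OR-plane row is the set of AND-plane outputs it sums."""
    return [int(any(y[i] for i in row)) for row in rows]

# z0 = x0'·x1 + x0·x1' (XOR as a sum of two products)
products = [[False, True], [True, False]]
sums = [{0, 1}]
for a in (0, 1):
    for b in (0, 1):
        assert or_plane(sums, and_plane(products, [a, b]))[0] == (a ^ b)
```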

2.7.4 Address Decoder and Tristate Buffers

Another useful type of component is the address decoder. An n-to-2^n address decoder (Fig. 2.63a) has n inputs and 2^n outputs, and its function is defined as follows: if xn-1 xn-2 ... x0 is the binary

Fig. 2.64 AND-OR plane implementation of a ROM

Fig. 2.65 4-Bit MUX4-1 implemented with an address decoder and four tristate buffers

representation of the natural i, then yi = 1 and all other outputs yj = 0. As an example, a 2-to-4 address decoder and its truth table are shown in Fig. 2.63b. In fact, an n-to-2^n address decoder implements the same function as an (n, 2^n) AND plane that generates all n-variable minterms: mj = wj,0·wj,1·...·wj,n-1 where wj,i ∈ {xi, xi'}. By connecting an n-to-2^n address decoder to a (2^n, s) OR plane, the obtained circuit implements the same function as a ROM storing 2^n s-bit words. An example is given in Fig. 2.64a: the AND plane synthesizes the functions of a 2-to-4 address decoder and the complete circuit implements the same functions as the ROM of Fig. 2.64b. The other common application of address decoders is the control of data buses. An example is given in Fig. 2.65: a 2-to-4 address decoder generates four signals that control four 4-bit tristate buffers. This circuit makes it possible to connect a 4-bit output z to one among four 4-bit inputs y0, y1, y2, or y3 under the control of two address bits x1 and x0. Actually, the circuit of Fig. 2.65 realizes the same function as a 4-bit MUX4-1. In Fig. 2.66a the circuit of Fig. 2.65 is used to connect one among four data sources (circuits A, B, C, and D) to a data destination (circuit E) under the control of two address bits. It executes the following algorithm:

case x1 x0 is
  when 00 => circuit_E = circuit_A;
  when 01 => circuit_E = circuit_B;
  when 10 => circuit_E = circuit_C;
  when 11 => circuit_E = circuit_D;
end case;

The usual symbol of this bus is shown in Fig. 2.66b.
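The decoder-plus-tristate-buffer bus of Fig. 2.66 can be sketched in Python as follows (a behavioral model under the assumption that exactly one buffer is enabled at a time; function names are illustrative):

```python
# Sketch of the bus of Fig. 2.66: a 2-to-4 decoder enables one of four
# tristate buffers so that a single 4-bit source reaches the destination.

def decoder(x1, x0):
    """2-to-4 address decoder: output i is 1 iff (x1 x0) encodes i."""
    i = 2 * x1 + x0
    return [int(j == i) for j in range(4)]

def bus(sources, x1, x0):
    """Connect one of four 4-bit sources to z, as the buffers do."""
    enable = decoder(x1, x0)
    z = [0, 0, 0, 0]
    # On the real bus the disabled buffers are high impedance; here only
    # the enabled source is copied to z.
    for en, src in zip(enable, sources):
        if en:
            z = src
    return z

srcs = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
assert bus(srcs, 1, 0) == [0, 1, 0, 0]   # x1 x0 = 10 selects the third source
```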

Fig. 2.66 A 4-bit data bus with four data sources

2.8 Programming Language Structures

The specification of digital systems by means of algorithms (Sect. 1.2.1) is a central aspect of this course. In this section the relation between some programming language instructions and digital circuits is analyzed. This relation justifies the use of hardware description languages (HDLs) similar to programming languages, as well as the development of synthesis tools able to translate HDL descriptions into circuits.

2.8.1 If Then Else

A first example of an instruction that can be translated to a circuit is the conditional branch:

if a_condition then some_actions else other_actions;

As an example, consider the following binary decision algorithm. It computes the value of a switching function f of six variables x0, x1, y0, y1, y2, and y3.

Algorithm 2.4

if x1 = 0 then
  if x0 = 0 then f = y0; else f = y1; end if;
else
  if x0 = 0 then f = y2; else f = y3; end if;
end if;

This function can be implemented by the circuit of Fig. 2.67 in which the external conditional branch is implemented by the rightmost MUX2-1 and the two internal conditional branches are implemented by the leftmost MUX2-1s.
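The multiplexer tree of Fig. 2.67 can be sketched in Python (mux2 models a MUX2-1; the names are illustrative):

```python
# Sketch of Fig. 2.67: Algorithm 2.4 as a tree of 2-to-1 multiplexers.

def mux2(sel, in0, in1):
    """MUX2-1: output in0 when sel = 0, in1 when sel = 1."""
    return in1 if sel else in0

def f(x1, x0, y0, y1, y2, y3):
    left = mux2(x0, y0, y1)       # inner branch for x1 = 0
    right = mux2(x0, y2, y3)      # inner branch for x1 = 1
    return mux2(x1, left, right)  # outer branch (rightmost MUX2-1)

assert f(0, 1, 0, 1, 0, 0) == 1   # x1 = 0, x0 = 1 selects y1
assert f(1, 0, 0, 0, 1, 0) == 1   # x1 = 1, x0 = 0 selects y2
```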


Fig. 2.67 Binary decision algorithm implementation


Fig. 2.68 Case instruction implementation

2.8.2 Case

A second example of an instruction that can be translated to a circuit is the conditional switch:

case variable_identifier is
  when variable_value1 => actions1;
  when variable_value2 => actions2;
end case;

As an example, the preceding binary decision algorithm (Algorithm 2.4) is equivalent to the following, assuming that x has been previously defined as a 2-bit vector (x1, x0).

Algorithm 2.5

case x is
  when 00 => f = y0;
  when 01 => f = y1;
  when 10 => f = y2;
  when 11 => f = y3;
end case;

Function f can be implemented by a MUX4-1 (Fig. 2.68).
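In software terms, the case instruction of Algorithm 2.5 is simply an indexed selection, which is exactly what a MUX4-1 computes; a minimal Python sketch (names are illustrative):

```python
# Sketch of Fig. 2.68: the case instruction as a 4-to-1 multiplexer.

def mux4(x1, x0, y):
    """MUX4-1: when 00 => y[0], 01 => y[1], 10 => y[2], 11 => y[3]."""
    return y[2 * x1 + x0]

assert mux4(1, 1, [0, 0, 0, 1]) == 1   # x1 x0 = 11 selects y3
```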

2.8.3 Loops

For-loops are a third example of an easily translatable construct:

for variable_identifier in variable_range loop
  operations using the variable value
end loop;



Fig. 2.69 4-Digit decimal adder

An iterative circuit can often be associated with this type of instruction. As an example, consider the following addition algorithm that computes z = x + y where x and y are 4-digit decimal numbers, so that z is a 5-digit decimal number.

Algorithm 2.6 Addition of Two 4-Digit Naturals

cy0 = 0;
for i in 0 to 3 loop
  --------- loop body:
  si = xi + yi + cyi;
  if si > 9 then zi = si - 10; cyi+1 = 1;
  else zi = si; cyi+1 = 0;
  end if;
  --------- end of loop body
end loop;
z4 = cy4;

The corresponding circuit is shown in Fig. 2.69b. It is an iterative circuit that consists of four identical blocks. Each of them is a 1-digit adder (Fig. 2.69a) that implements the loop body of Algorithm 2.6.

Comments 2.5

• Other (not combinational but sequential) loop implementation methods will be studied in Chap. 4.
• Not every loop can be implemented by means of an iterative combinational circuit. Consider a while-loop:

while a_condition loop
  operations
end loop;

The loop body is executed as long as some condition (which can be modified by the operations) is true. If the maximum number of times that the condition will be true is either unknown or too large, a sequential implementation (Chap. 4) must be considered.
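The decimal adder of Algorithm 2.6 can be sketched in Python, one loop iteration per 1-digit adder block of Fig. 2.69a (the list-of-digits encoding is an illustrative choice):

```python
# Sketch of Algorithm 2.6: a 4-digit decimal ripple adder.

def decimal_add(x, y):
    """x, y: lists of 4 decimal digits, least significant digit first."""
    cy = 0
    z = []
    for xi, yi in zip(x, y):
        s = xi + yi + cy           # si = xi + yi + cyi
        if s > 9:
            z.append(s - 10)       # zi = si - 10, carry out
            cy = 1
        else:
            z.append(s)
            cy = 0
    z.append(cy)                   # z4 = cy4
    return z

# 2718 + 1993 = 4711, digits given least significant first
assert decimal_add([8, 1, 7, 2], [3, 9, 9, 1]) == [1, 1, 7, 4, 0]
```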


Fig. 2.70 Implementation of procedure calls


2.8.4 Procedure Calls

Procedure (or function) calls constitute a fundamental aspect of well-structured programs and can be associated with hierarchical circuit descriptions. The following algorithm computes z = x1·y1 + x2·y2 + ... + x8·y8. For that it makes several calls to a previously defined multiply and accumulate (MAC) procedure, to which it passes four parameters a, b, c, and d. The procedure call MAC(a, b, c, d) executes d = a + b·c.

Algorithm 2.7 z = x1·y1 + x2·y2 + ... + x8·y8

w(1) = 0;
for i in 1 to 8 loop
  MAC(w(i), x(i), y(i), w(i+1));
end loop;
z = w(9);

Thus w(2) = 0 + x1·y1 = x1·y1, w(3) = x1·y1 + x2·y2, w(4) = x1·y1 + x2·y2 + x3·y3, ..., z = w(9) = x1·y1 + x2·y2 + x3·y3 + ... + x8·y8. The corresponding circuit is shown in Fig. 2.70. Algorithm 2.7 is a for-loop with which an iterative circuit is associated. The loop body is a procedure call to which corresponds a component MAC whose functional specification is d = a + b·c. This is an example of top-down hierarchical description: an iterative circuit structure (the top level) whose components are defined by their function and must afterwards be implemented (the lower level).
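The MAC chain of Fig. 2.70 can be sketched in Python (mac models the component's specification d = a + b·c; names are illustrative):

```python
# Sketch of Algorithm 2.7: the iterative MAC chain of Fig. 2.70.

def mac(a, b, c):
    """Functional specification of the MAC component: d = a + b*c."""
    return a + b * c

def dot_product(x, y):
    w = 0                        # w(1) = 0
    for xi, yi in zip(x, y):     # one MAC component per loop iteration
        w = mac(w, xi, yi)
    return w                     # z = w(n+1)

assert dot_product([1, 2, 3], [4, 5, 6]) == 32   # 1*4 + 2*5 + 3*6
```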


2.8.5 Conclusion

There are several programming language constructs that can easily be translated to circuits. This fact justifies the use of formal languages to specify digital circuits, either classical programming languages such as C/C++ or specific HDLs such as VHDL or Verilog. In this course VHDL will be used (Appendix A). The relation between programming language instructions and circuits also explains why it has been possible to develop software packages able to synthesize circuits starting from functional descriptions in some formal language.

2.9 Exercises

1. Synthesize with logic gates the function z of Table 2.3.
2. Generate Boolean expressions of functions f, g, and h of three variables x2, x1, and x0 defined by the following table:

x2x1x0   f   g   h
000      1   0   1
001      –   –   1
010      0   0   –
011      1   1   –
100      –   1   0
101      1   1   0
110      0   –   1
111      0   0   –

3. Simplify the following sets of cubes (n = 4):
{0000, 0010, 01x1, 0110, 1000, 1010},
{0001, 0011, 0100, 0101, 1100, 1110, 1011, 1010},
{0000, 0010, 1000, 1010, 0101, 1101, 1111}.
4. The following circuit consists of seven identical components with two inputs a and b, and two outputs c and d. The maximum propagation time from inputs a or b to outputs c or d is equal to 0.5 ns. What is the maximum propagation time from any input to any output (in ns)?



5. Compute an upper bound Nmax and a lower bound Nmin of the number N of functions that can be implemented by the following circuit.


6. Implement with MUX2-1 components the switching functions of three variables x2, x1, and x0 defined by the following sets of cubes:
{11x, 101, 011}, {111, 100, 010, 001}, {1x1, 0x1}.
7. What set of cubes defines the function f(x5, x4, x3, x2, x1, x0) implemented by the following circuit?
8. Minimize the following Boolean expression:
f(a, b, c, d) = a.b.c.d + a.b + a.b.c + a.b + a.c.
9. Implement the circuits of Figs. 2.14, 2.16, and 2.17 with NAND gates.
10. Implement (2.34) with NAND gates.

References

Burch C (2005) Logisim. http://www.cburch.com/logisim/es/index.html
Karnaugh M (1953) The map method for synthesis of combinational logic circuits. Trans Inst Electr Eng (AIEE) Part I 72(9):593–599

3 Arithmetic Blocks

Arithmetic circuits are an essential part of many digital circuits and thus deserve a particular treatment. In this chapter implementations of the basic arithmetic operations are presented. Only operations with naturals (nonnegative integers) are considered. A much more detailed and complete presentation of arithmetic circuits can be found in Parhami (2000), Ercegovac and Lang (2004), Deschamps et al. (2006), and Deschamps et al. (2012).

3.1 Binary Adder

Binary adders have already been described several times (e.g., Figs. 2.5 and 2.27). Given two n-bit naturals x and y and an incoming carry bit cy0, an n-bit adder computes an (n + 1)-bit number s = x + y + cy0, in which sn can be used as an outgoing carry bit cyn. The classical pencil and paper algorithm, adapted to the binary system, is the following:

Algorithm 3.1 Binary Adder: s = x + y + cy0

for i in 0 to n-1 loop
  si = xi xor yi xor cyi;
  cyi+1 = (xi and yi) or (xi and cyi) or (yi and cyi);
end loop;
sn = cyn;

At each step

si = (xi + yi + cyi) mod 2 = xi xor yi xor cyi,    (3.1)

and cyi+1 = 1 if, and only if, at least two bits among xi, yi, and cyi are equal to 1, a condition that can be expressed as follows:

cyi+1 = xi·yi + xi·cyi + yi·cyi.    (3.2)

The circuit that implements Algorithm 3.1 (a for-loop) is shown in Fig. 3.1. It consists of n identical blocks called full adders (FA) that implement the loop body of Algorithm 3.1 (3.1 and 3.2).
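The ripple-carry structure of Fig. 3.1 can be sketched in Python, with one call to a full-adder function per loop iteration (the bit-list encoding, least significant bit first, is an illustrative choice):

```python
# Sketch of Algorithm 3.1: an n-bit ripple-carry adder built from full
# adders implementing Eqs. (3.1) and (3.2).

def full_adder(x, y, cy):
    s = x ^ y ^ cy                            # Eq. (3.1)
    cy_next = (x & y) | (x & cy) | (y & cy)   # Eq. (3.2)
    return s, cy_next

def add(x, y, cy0=0):
    """x, y: bit lists, least significant bit first; returns n + 1 bits."""
    s, cy = [], cy0
    for xi, yi in zip(x, y):
        si, cy = full_adder(xi, yi, cy)
        s.append(si)
    s.append(cy)                              # sn = cyn
    return s

# 5 + 6 = 11: 101 + 110 = 1011
assert add([1, 0, 1], [0, 1, 1]) == [1, 1, 0, 1]
```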

© Springer International Publishing Switzerland 2017 J.-P. Deschamps et al., Digital Systems, DOI 10.1007/978-3-319-41198-9_3

Fig. 3.1 n-Bit adder

3.2 Binary Subtractor

Given two n-bit naturals x and y and an incoming borrow bit b0, an n-bit subtractor computes d = x - y - b0. Thus d ≥ 0 - (2^n - 1) - 1 = -2^n and d ≤ (2^n - 1) - 0 - 0 = 2^n - 1, so that d is a signed integer belonging to the range -2^n ≤ d ≤ 2^n - 1. The classical pencil and paper algorithm, adapted to the binary system, is used. At each step the difference xi - yi - bi is computed and expressed under the form

xi - yi - bi = di - 2·bi+1 where di and bi+1 ∈ {0, 1}.    (3.3)

If xi - yi - bi < 0 then di = xi - yi - bi + 2 and bi+1 = 1; if xi - yi - bi ≥ 0 then di = xi - yi - bi and bi+1 = 0. At the end of step n the result is obtained under the form

d = -dn·2^n + dn-1·2^(n-1) + dn-2·2^(n-2) + ... + d0·2^0    (3.4)

where dn = bn is the last borrow bit. This type of representation (3.4), in which the most significant bit dn has a negative weight -2^n, is the 2's complement representation of the signed integer d. In this representation dn is the sign bit.

Example 3.1 Compute (n = 4) 0111 - 1001 - 1:

xi:     0 1 1 1
yi:     1 0 0 1
bi:   1 0 0 1 1
di:   1 1 1 0 1

Conclusion: 0111 - 1001 - 1 = 11101. In decimal: 7 - 9 - 1 = -16 + 13 = -3. By reducing both members of (3.3) modulo 2 the following relation is obtained:

di = xi xor yi xor bi.    (3.5)

On the other hand, bi+1 = 1 if, and only if, xi - yi - bi < 0, that is, when xi = 0 and either yi or bi is equal to 1, or when both yi and bi are equal to 1. This condition can be expressed as follows:

bi+1 = xi'·yi + xi'·bi + yi·bi.    (3.6)

The following algorithm computes d.

Fig. 3.2 n-Bit subtractor

Algorithm 3.2 Binary Subtractor: d = x - y - b0

for i in 0 to n-1 loop
  di = xi xor yi xor bi;
  bi+1 = (not(xi) and yi) or (not(xi) and bi) or (yi and bi);
end loop;
dn = bn;

The circuit that implements Algorithm 3.2 (a for-loop) is shown in Fig. 3.2. It consists of n identical blocks called full subtractors (FS) that implement the loop body of Algorithm 3.2 (3.5 and 3.6).
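A Python sketch of the ripple subtractor of Fig. 3.2, one full subtractor per loop iteration (bit lists are least significant bit first; names are illustrative):

```python
# Sketch of Algorithm 3.2: an n-bit ripple subtractor built from full
# subtractors implementing Eqs. (3.5) and (3.6); the last borrow is the
# sign bit of the 2's complement result.

def full_subtractor(x, y, b):
    d = x ^ y ^ b                                      # Eq. (3.5)
    b_next = ((1 - x) & y) | ((1 - x) & b) | (y & b)   # Eq. (3.6)
    return d, b_next

def sub(x, y, b0=0):
    d, b = [], b0
    for xi, yi in zip(x, y):
        di, b = full_subtractor(xi, yi, b)
        d.append(di)
    d.append(b)      # dn = bn (sign bit)
    return d

# Example 3.1: 0111 - 1001 - 1 = 11101 (-3 in 2's complement)
assert sub([1, 1, 1, 0], [1, 0, 0, 1], 1) == [1, 0, 1, 1, 1]
```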

3.3 Binary Adder/Subtractor

Given two n-bit naturals x and y and a 1-bit control input a/s, an n-bit adder/subtractor computes z = [x + y] mod 2^n if a/s = 0 and z = [x - y] mod 2^n if a/s = 1. To compute z, define y' as the natural deduced from y by inverting all its bits: y' = yn-1' yn-2' ... y0', and check that

y' = (1 - yn-1)·2^(n-1) + (1 - yn-2)·2^(n-2) + ... + (1 - y0)·2^0 = 2^n - 1 - y.    (3.7)

Thus z can be computed as follows:

z = [x + w + a/s] mod 2^n, where w = y if a/s = 0 and w = y' if a/s = 1.    (3.8)

In other words, wi = a/s xor yi, for all i = 0 to n - 1.

Algorithm 3.3 Binary Adder/Subtractor

for i in 0 to n-1 loop
  wi = a/s xor yi;
end loop;
z = (x + w + a/s) mod 2^n;

The circuit that implements Algorithm 3.3 is shown in Fig. 3.3. It consists of an n-bit adder and n XOR2 gates. An additional XOR2 gate computes ovf (overflow): if a/s = 0, ovf = 1 if, and only if, cyn = 1 and thus x + y ≥ 2^n; if a/s = 1, ovf = 1 if, and only if, cyn = 0 and thus x + (2^n - 1 - y) + 1 < 2^n, that is to say, if x - y < 0.
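The adder/subtractor of Fig. 3.3 can be sketched at word level in Python (using integers instead of bit vectors; the function name is illustrative):

```python
# Sketch of Fig. 3.3: conditionally invert y (the XOR gates), inject a/s
# as the incoming carry, and XOR a/s with the outgoing carry to get ovf.

def addsub(x, y, a_s, n):
    """x, y: naturals < 2^n; a_s = 0 for add, 1 for subtract."""
    w = y ^ (2**n - 1) if a_s else y   # invert all n bits of y if subtracting
    total = x + w + a_s                # x + w + a/s
    cy_n = total >> n                  # outgoing carry cyn
    z = total % 2**n
    ovf = cy_n ^ a_s                   # overflow flag as described in the text
    return z, ovf

assert addsub(9, 5, 1, 4) == (4, 0)    # 9 - 5 = 4
assert addsub(5, 9, 1, 4) == (12, 1)   # 5 - 9 < 0: result mod 16, ovf = 1
assert addsub(9, 9, 0, 4) == (2, 1)    # 9 + 9 >= 16: overflow
```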

Fig. 3.3 n-Bit adder/subtractor

Fig. 3.4 Multiplication by yi (circuit and symbol)

3.4 Binary Multiplier

Given an n-bit natural x and an m-bit natural y, a multiplier computes p = x·y. The maximum value of p is (2^n - 1)·(2^m - 1) < 2^(n+m), so that p is an (n + m)-bit natural. If y = ym-1·2^(m-1) + ym-2·2^(m-2) + ... + y1·2 + y0, then

p = x·ym-1·2^(m-1) + x·ym-2·2^(m-2) + ... + x·y1·2 + x·y0.    (3.9)

The preceding expression can be computed as follows. First compute a set of partial products p0 = x·y0, p1 = x·y1·2, p2 = x·y2·2^2, ..., pm-1 = x·ym-1·2^(m-1), and then add the m partial products: p = p0 + p1 + p2 + ... + pm-1. The computation of each partial product pi = x·yi·2^i is very easy. The product x·yi is computed by a set of AND2 gates (Fig. 3.4) and the multiplication by 2^i amounts to appending i 0s to the right of the binary representation of x·yi. For example, if i = 5 and x·y5 = 10010110 then x·y5·2^5 = 1001011000000. The computation of p = p0 + p1 + p2 + ... + pm-1 can be executed as a sequence of 2-operand additions: p = (...(((0 + p0) + p1) + p2)...) + pm-1. The following algorithm computes p.

Fig. 3.5 Binary multiplier

Algorithm 3.4 Binary Multiplier: p = x·y (Right to Left Algorithm)

acc0 = 0;
for i in 0 to m-1 loop
  acci+1 = acci + x·yi·2^i;
end loop;
p = accm;

Example 3.2 Compute (n = 5, m = 4) 11101 × 1011 (in decimal 29 × 11). The values of acci are the following:

acc0: 000000000
acc1: 000011101
acc2: 001010111
acc3: 001010111
acc4: 100111111

Result: p = 100111111 (in decimal 319).

The circuit that implements Algorithm 3.4 (a for-loop) is shown in Fig. 3.5. It consists of m identical blocks that implement the loop body of Algorithm 3.4: acc_out = acc_in + b·a where b = x·2^i and a = yi.

Comment 3.1 The building block of Fig. 3.5 computes acc_in + b·a. At step i input b = x·2^i is an (n + i)-bit number. In particular, at step m - 1 it is an (n + m - 1)-bit number. Thus, if all blocks are identical,

Fig. 3.6 Optimized block

they must include an (n + m - 1)-bit adder and n + m - 1 AND2 gates. Nevertheless, for each i this building block can be optimized. At step number i it computes acc_out = acc_in + x·yi·2^i where acc_in is an (i + n)-bit number (at each step one bit is added). On the other hand, the rightmost i bits of x·yi·2^i are equal to 0. Thus

acc_out(i + n .. i) = acc_in(i + n - 1 .. i) + x·yi,
acc_out(i - 1 .. 0) = acc_in(i - 1 .. 0).

The corresponding optimized block is shown in Fig. 3.6. Each block contains an n-bit adder and n AND2 gates.
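The shift-and-add scheme of Algorithm 3.4 can be sketched in a few lines of Python (integer x, bits of y least significant first; names are illustrative):

```python
# Sketch of Algorithm 3.4: right-to-left shift-and-add multiplication;
# each loop iteration corresponds to one block of Fig. 3.5.

def multiply(x, y_bits):
    """x: natural; y_bits: bits of y, least significant bit first."""
    acc = 0
    for i, yi in enumerate(y_bits):
        acc = acc + (x << i) * yi   # acc_{i+1} = acc_i + x*yi*2^i
    return acc

# Example 3.2: 11101 * 1011 = 29 * 11 = 319
assert multiply(0b11101, [1, 1, 0, 1]) == 319
```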

3.5 Binary Divider

Division is the most complex of these operations. Given two naturals x and y, their quotient q = x/y is usually not an integer; it is a so-called rational number. In many cases it is not even a fixed-point number, so the desired accuracy must be taken into account. The quotient q with an accuracy of p fractional bits is defined by the following relation:

x/y = q + e where q is a multiple of 2^-p and e < 2^-p.    (3.10)

In other words, q is a fixed-point number with p fractional bits

q = qm-1 qm-2 ... q0 . q-1 q-2 ... q-p    (3.11)

such that the error e = x/y - q is smaller than 2^-p. Most division algorithms work with naturals x and y such that x < y, so that

q = 0 . q-1 q-2 ... q-p.    (3.12)

Consider the following sequence of integer divisions by y with x = r0 < y:

2·r0 = q-1·y + r1 with r1 < y,
2·r1 = q-2·y + r2 with r2 < y,
...
2·rp-2 = q-p+1·y + rp-1 with rp-1 < y,
2·rp-1 = q-p·y + rp with rp < y.    (3.13)


At each step, q-i and ri are computed as functions of ri-1 and y so that the following relation holds true:

2·ri-1 = q-i·y + ri.    (3.14)

For that:
• Compute d = 2·ri-1 - y.
• If d < 0 then q-i = 0 and ri = 2·ri-1; else q-i = 1 and ri = d.

Property 3.1 x/y = 0.q-1 q-2 ... q-p+1 q-p + (rp/y)·2^-p with (rp/y)·2^-p < 2^-p.

Proof Multiply the first equation of (3.13) by 2^(p-1), the second by 2^(p-2), and so on. Thus

2^p·r0 = 2^(p-1)·q-1·y + 2^(p-1)·r1 with r1 < y,
2^(p-1)·r1 = 2^(p-2)·q-2·y + 2^(p-2)·r2 with r2 < y,
...
2^2·rp-2 = 2·q-p+1·y + 2·rp-1 with rp-1 < y,
2·rp-1 = q-p·y + rp with rp < y.

Then add up the p equations:

2^p·r0 = 2^(p-1)·q-1·y + 2^(p-2)·q-2·y + ... + 2·q-p+1·y + q-p·y + rp with rp < y,

so that

x = (2^-1·q-1 + 2^-2·q-2 + ... + 2^-p+1·q-p+1 + 2^-p·q-p)·y + rp·2^-p with rp·2^-p < y·2^-p,

and

x/y = 0.q-1 q-2 ... q-p+1 q-p + (rp/y)·2^-p with (rp/y)·2^-p < 2^-p.

Example 3.3 Compute 21/35 with an accuracy of 6 bits:

2 × 21 = 1 × 35 + 7
2 × 7  = 0 × 35 + 14
2 × 14 = 0 × 35 + 28
2 × 28 = 1 × 35 + 21
2 × 21 = 1 × 35 + 7
2 × 7  = 0 × 35 + 14

Thus q = 0.100110. In decimal: q = 38/2^6 = 19/32. Error = 21/35 - 19/32 = 0.6 - 0.59375 = 0.00625 < 2^-6 = 0.015625. The following algorithm computes q with an accuracy of p fractional bits.

Fig. 3.7 Binary divider

Algorithm 3.5 Binary Divider: q ≈ x/y, Error < 2^-p (Restoring Algorithm)

r0 = x;
for i in 1 to p loop
  d = 2·ri-1 - y;
  if d < 0 then q-i = 0; ri = 2·ri-1;
  else q-i = 1; ri = d;
  end if;
end loop;

The circuit that implements Algorithm 3.5 (a for-loop) is shown in Fig. 3.7. It consists of p identical blocks that implement the loop body of Algorithm 3.5.
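Algorithm 3.5 can be sketched directly in Python (quotient bits returned most significant fraction bit first; names are illustrative):

```python
# Sketch of Algorithm 3.5: restoring division producing p fractional
# quotient bits q_{-1} ... q_{-p} for x/y with x < y.

def restoring_div(x, y, p):
    q, r = [], x              # r0 = x
    for _ in range(p):
        d = 2 * r - y
        if d < 0:
            q.append(0)       # q_{-i} = 0, "restore": r_i = 2*r_{i-1}
            r = 2 * r
        else:
            q.append(1)       # q_{-i} = 1, r_i = d
            r = d
    return q, r

# Example 3.3: 21/35 with 6 fractional bits -> q = 0.100110
assert restoring_div(21, 35, 6) == ([1, 0, 0, 1, 1, 0], 14)
```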

3.6 Exercises

1. An integer x can be represented under the form (-1)^s·m where s is the sign of x and m is its magnitude (absolute value). Design an n-bit sign-magnitude adder/subtractor.
2. An incrementer-decrementer is a circuit with two n-bit inputs x and m, one binary control input up/down, and one n-bit output z. If up/down = 0, it computes z = (x + 1) mod m, and if up/down = 1, it computes z = (x - 1) mod m. Design an n-bit incrementer-decrementer.
3. Consider the circuit of Fig. 3.5 with the optimized block of Fig. 3.6. The n-bit adder of Fig. 3.6 can be implemented with n 1-bit adders (full adders). Define a 1-bit multiplier as being a component


with four binary inputs a, b, c, d, and two binary outputs e and f, that computes a·b + c + d and expresses the result as 2·e + f (a 2-bit number). Design an n-bit-by-m-bit multiplier consisting of 1-bit multipliers.
4. Synthesize a 2n-bit-by-2n-bit multiplier using n-bit-by-n-bit multipliers and n-bit adders as components.
5. A mod m reducer is a circuit with two n-bit inputs x and m (m > 2) and one n-bit output z = x mod m. Synthesize a mod m reducer.

References

Deschamps JP, Bioul G, Sutter G (2006) Synthesis of arithmetic circuits. Wiley, New York
Deschamps JP, Sutter G, Cantó E (2012) Guide to FPGA implementation of arithmetic functions. Springer, Netherlands
Ercegovac M, Lang T (2004) Digital arithmetic. Morgan Kaufmann Publishers, San Francisco
Parhami B (2000) Computer arithmetic. Oxford University Press, Oxford

4 Sequential Circuits

The digital systems that have been defined and implemented in the preceding chapters are combinational circuits. If the component delays are not taken into account, this means that the value of their output signals only depends on the values of their input signals at the same time. However, many digital system specifications cannot be implemented by combinational circuits, because the value of an output signal may be a function not only of the value of the input signals at the same time, but also of the value of the input signals at preceding times.

4.1 Introductory Example

Consider the vehicle access control system of Fig. 4.1. It consists of

• A gate that can be raised and lowered by a motor
• A push button to request access
• Two sensors that detect two particular gate positions (upper and lower)
• A sensor that detects the presence of a vehicle within the gate area

The motor control system has four binary input signals:

• request, equal to 1 when there is an entrance request (push button)
• lower, equal to 1 when the gate has been completely lowered
• upper, equal to 1 when the gate has been completely raised
• vehicle, equal to 1 if there is a vehicle within the gate area

The binary output signals on/off and up/down control the motor:

• To raise the gate: on/off = 1 and up/down = 1
• To lower the gate: on/off = 1 and up/down = 0
• To maintain the gate open or closed: on/off = 0

• To raise the gate on/off ¼ 1 and up/down ¼ 1 • To lower the gate on/off ¼ 1 and up/down ¼ 0 • To maintain the gate open or closed on/off ¼ 0 The motor control system cannot be implemented by a combinational circuit. As an example, if at some time request ¼ 0, vehicle ¼ 0, upper ¼ 0, and lower ¼ 0, this set of input signal values could # Springer International Publishing Switzerland 2017 J.-P. Deschamps et al., Digital Systems, DOI 10.1007/978-3-319-41198-9_4

79

Fig. 4.1 Vehicle access control

correspond to two different situations: (1) a vehicle is present in front of the gate, the request button has been pushed and released, and the gate is moving up; or (2) a vehicle has entered and the gate is moving down. In the first case on/off = 1 and up/down = 1; in the second case on/off = 1 and up/down = 0. In conclusion, the values of the signals that control the motor depend on the following sequence of events:

1. Wait for request = 1 (entrance request)
2. Raise the gate
3. Wait for upper = 1 (gate completely open)
4. Wait for vehicle = 0 (gate area cleared)
5. Lower the gate
6. Wait for lower = 1 (gate completely closed)

A new entrance request is not attended to until this sequence of events is completed. Conclusion: some type of memory is necessary in order to store the current step number (1–6) within the sequence of events.

4.2 Definition

Sequential circuits are digital systems with memory. They implement systems whose output signal values depend on the input signal values at times t (the current time), t - 1, t - 2, and so on (the precise meaning of t - 1, t - 2, etc. will be defined later). Two simple examples are sequence detectors and sequence generators.

Example 4.1 (Sequence Detector) Implement a circuit (Fig. 4.2a) with a decimal input x and a binary output y. It generates an output value y = 1 every time that the four latest inputted values were 1 5 5 7. It is described by the following instruction, in which t stands for the current time:

if x(t-3) = 1 AND x(t-2) = 5 AND x(t-1) = 5 AND x(t) = 7
then y = 1;
else y = 0;
end if;

Fig. 4.2 Sequence detector and sequence generator

Fig. 4.3 Sequential circuit

Thus, the corresponding circuit must store x(t - 3), x(t - 2), and x(t - 1) and generate y as a function of the stored values and of the current value of x.

Example 4.2 (Sequence Generator) Implement a circuit (Fig. 4.2b) with a binary output y that continuously generates the output sequence 011011011011... It is described by the following instruction, in which t stands for the current time:

if y(t-2) = 1 AND y(t-1) = 1
then y = 0;
else y = 1;
end if;

The corresponding circuit must store y(t - 2) and y(t - 1) and generate the current value of y as a function of the stored values. Initially (t = 0) the stored values y(-2) and y(-1) are equal to 1, so that the first output value of y is 0.

The general structure of a sequential circuit is shown in Fig. 4.3. It consists of

• A combinational circuit that implements k + m switching functions y0, y1, ..., yk-1, q0Δ, q1Δ, ..., qm-1Δ of n + m variables x0, x1, ..., xn-1, q0, q1, ..., qm-1
• A memory that stores an m-bit vector

The combinational circuit inputs x0, x1, ..., xn-1 are inputs of the sequential circuit, while (q0, q1, ..., qm-1) is an m-bit vector read from the memory. The combinational circuit outputs y0, y1, ..., yk-1 are outputs of the sequential circuit, while (q0Δ, q1Δ, ..., qm-1Δ) is an m-bit vector written to the memory. The way the memory is implemented and the moments when the memory contents (q0, q1, ..., qm-1) are updated and replaced by (q0Δ, q1Δ, ..., qm-1Δ) will be defined later. With this structure, the output signals y0, y1, ..., yk-1 depend not only on the current value of the input signals x0, x1, ..., xn-1 but also on the memory contents q0, q1, ..., qm-1. The values of q0, q1, ..., qm-1 are updated at times ..., t - 1, t, t + 1, ... with new values q0Δ, q1Δ, ..., qm-1Δ generated by the combinational circuit. The following terminology is commonly used:

• x0, x1, ..., xn-1 are the external inputs
• y0, y1, ..., yk-1 are the external outputs

Fig. 4.4 Sequence detector implementation

• (q0, q1, ..., qm-1) is the internal state
• (q0Δ, q1Δ, ..., qm-1Δ) is the next state

To summarize,

• The memory stores the internal state.
• The combinational circuit computes the value of the external outputs and the next state as functions of the external inputs and of the current internal state.
• The internal state is updated at every time unit ..., t - 1, t, t + 1, ... by replacing q0 by q0Δ, q1 by q1Δ, and so on.

Example 4.3 (Sequence Detector Implementation) The sequence detector of Example 4.1 can be implemented by the sequential circuit of Fig. 4.4, in which x, q0, q1, q2, q0Δ, q1Δ, and q2Δ are 4-bit vectors that represent decimal digits. The memory must store the three previous values of x, that is, q0 = x(t - 1), q1 = x(t - 2), and q2 = x(t - 3). For that

q0Δ = x, q1Δ = q0, q2Δ = q1.    (4.1)

The output y is defined as follows:

y = 1 if, and only if, q2 = 1 AND q1 = 5 AND q0 = 5 AND x = 7.    (4.2)
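The detector of Fig. 4.4 can be sketched behaviorally in Python; one loop iteration models one clock cycle (the all-zero initial memory contents are an illustrative assumption):

```python
# Sketch of Fig. 4.4: the memory holds the three previous inputs
# (Eq. 4.1) and the combinational part computes y (Eq. 4.2).

def detector(inputs):
    q0 = q1 = q2 = 0          # assumed initial memory contents
    outputs = []
    for x in inputs:
        y = int(q2 == 1 and q1 == 5 and q0 == 5 and x == 7)  # Eq. (4.2)
        outputs.append(y)
        q2, q1, q0 = q1, q0, x                               # Eq. (4.1)
    return outputs

assert detector([1, 5, 5, 7, 2]) == [0, 0, 0, 1, 0]
```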

Equations (4.1) and (4.2) define the combinational circuit function.

In the previous definitions and examples the concept of current time t has been used without being explicitly defined. To synchronize a sequential circuit, to give sense to the concept of current time and, in particular, to define the moments when the internal state is updated, a clock signal must be generated. It is a square wave signal (Fig. 4.5) with period T. The positive edges of this clock signal define the times that have been called ..., t - 1, t, t + 1, ..., expressed in multiples of the clock signal period T. In particular, the positive edges define the moments when the internal state is replaced by the next state. In Fig. 4.5 some commonly used terms are defined:

• Positive edge: a transition of the clock signal from 0 to 1
• Negative edge: a transition of the clock signal from 1 to 0
• Cycle: section of a clock signal that corresponds to one period
• Frequency: the number of cycles per second (1/T)
• Positive pulse: the part of a clock signal cycle where clock = 1
• Negative pulse: the part of a clock signal cycle where clock = 0


Fig. 4.5 Clock signal

Comment 4.1 Instead of using the positive edges of the clock signal to synchronize the circuit operations, the negative edges could be used. This is an essential part of the specification of a sequential circuit: positive edge triggered or negative edge triggered.

4.3 Explicit Functional Description

Explicit functional descriptions of combinational circuits (Sect. 1.2.1) are tables that define the output signal values associated with all possible combinations of input signal values. In the case of sequential circuits all possible internal states must also be considered: different relations between input and output signals correspond to different internal states.

4.3.1 State Transition Graph

A state transition graph consists of a set of vertices that correspond to the internal states and a set of directed edges that define the internal state transitions and the output signal values in function of the input signal values.

Example 4.4 The graph of Fig. 4.6b defines a sequential circuit (Fig. 4.6a) that has three internal states A, B, and C, encoded with two binary variables q0 and q1 that are stored in the memory block; it has a binary input signal x and a binary output signal y. It works as follows:

• If the internal state is A and if x = 0 then the next state is C and y = 0.
• If the internal state is A and if x = 1 then the next state is A and y = 1.
• If the internal state is B then (whatever x) the next state is A and y = 0.
• If the internal state is C and if x = 0 then the next state is C and y = 0.
• If the internal state is C and if x = 1 then the next state is B and y = 1.

To complete the combinational circuit (Fig. 4.6a) specification it remains to choose the encoding of states A, B, and C, for example:

A: q0q1 = 00, B: q0q1 = 01, C: q0q1 = 10.    (4.3)

The following case instruction defines the combinational circuit function:


Fig. 4.6 Example of state transition graph (Mealy model)


Fig. 4.7 Example of state transition graph (Moore model)

case q0q1 is
  when 00 => if x = 0 then q0Δq1Δ = 10; y = 0;
             else q0Δq1Δ = 00; y = 1; end if;
  when 01 => q0Δq1Δ = 00; y = 0;
  when 10 => if x = 0 then q0Δq1Δ = 10; y = 0;
             else q0Δq1Δ = 01; y = 1; end if;
  when others => q0Δq1Δ = don't care; y = don't care;
end case;
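Such a case instruction is easy to mirror in software. The following Python sketch (ours, not part of the book's design flow) simulates the Mealy machine of Fig. 4.6b at the symbolic level; the table and function names are our own:

```python
# Symbolic simulation of the Mealy machine of Fig. 4.6b.
# Each entry maps (current state, input x) to (next state, output y).
MEALY = {
    ("A", 0): ("C", 0), ("A", 1): ("A", 1),
    ("B", 0): ("A", 0), ("B", 1): ("A", 0),
    ("C", 0): ("C", 0), ("C", 1): ("B", 1),
}

def run(inputs, state="A"):
    """Apply a sequence of input bits, one transition per clock edge."""
    outputs = []
    for x in inputs:
        state, y = MEALY[(state, x)]
        outputs.append(y)
    return state, outputs

# Starting in A, the input sequence 1, 0, 1 visits A, C, B.
print(run([1, 0, 1]))  # ('B', [1, 0, 1])
```

The dictionary plays the role of the combinational circuit; the loop variable `state` plays the role of the memory block updated on each clock edge.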

The clock signal (Sect. 4.2) is not represented in Fig. 4.6a but it is implicitly present and is responsible for the periodic updating of the internal state. The way the external output signal values are defined in Fig. 4.6b corresponds to the so-called Mealy model: the value of y depends on the current internal state and on the current value of the input signal x. In the following example another method is used.

Example 4.5 The graph of Fig. 4.7b defines a sequential circuit (Fig. 4.7a) that has three internal states A, B, and C, encoded with two binary variables q0 and q1 that are stored in the memory block; it has two binary input signals x0 and x1 that encode a ternary digit x ∈ {0, 1, 2} and a binary output signal y. It works as follows:

• If the internal state is A and if x = 0 then the next state is C and y = 1.
• If the internal state is A and if x = 1 then the next state is A and y = 1.

• If the internal state is A and if x = 2 then the next state is B and y = 1.
• If the internal state is B then (whatever x) the next state is A and y = 0.
• If the internal state is C and if x = 0 then the next state is C and y = 1.
• If the internal state is C and if x = 1 or 2 then the next state is B and y = 1.

To complete the combinational circuit (Fig. 4.7a) specification it remains to choose the encoding of states A, B, and C, for example the same as before (4.3). The following case instruction defines the combinational circuit function:

case q0q1 is
  when 00 => if x1x0 = 00 then q0Δq1Δ = 10;
             elsif x1x0 = 01 then q0Δq1Δ = 00;
             elsif x1x0 = 10 then q0Δq1Δ = 01;
             else q0Δq1Δ = don't care; end if;
             y = 1;
  when 01 => if x1x0 = 11 then q0Δq1Δ = don't care;
             else q0Δq1Δ = 00; end if;
             y = 0;
  when 10 => if x1x0 = 00 then q0Δq1Δ = 10;
             elsif (x1x0 = 01) or (x1x0 = 10) then q0Δq1Δ = 01;
             else q0Δq1Δ = don't care; end if;
             y = 1;
  when others => q0Δq1Δ = don't care; y = don't care;
end case;
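For comparison, here is a Python sketch (ours) of the Moore machine of Fig. 4.7b. The defining feature of the model appears directly in the code: the output table is indexed by the state alone.

```python
# Moore machine of Fig. 4.7b; the input x is a ternary digit in {0, 1, 2}.
NEXT = {
    ("A", 0): "C", ("A", 1): "A", ("A", 2): "B",
    ("B", 0): "A", ("B", 1): "A", ("B", 2): "A",
    ("C", 0): "C", ("C", 1): "B", ("C", 2): "B",
}
OUT = {"A": 1, "B": 0, "C": 1}  # y depends only on the current state

def moore_run(inputs, state="A"):
    """Sample the output of the current state, then take the transition."""
    outputs = []
    for x in inputs:
        outputs.append(OUT[state])
        state = NEXT[(state, x)]
    return state, outputs

print(moore_run([2, 0]))  # ('A', [1, 0])
```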

In this case, the value of y only depends on the current internal state. It is the so-called Moore model: the value of y only depends on the current internal state; it does not depend on the current value of the input signals x0 and x1.

To summarize, two graphical description methods have been described. In both cases it is a graph whose vertices correspond to the internal states of the sequential circuit and whose directed edges are labelled with the input signal values that cause the transition from one state to another. They differ in the way that the external output signals are defined.

• In the first case (Example 4.4) the external output values are a function of the internal state and of the external input values. The directed edges are labelled with both the input signal values that cause the transition and the corresponding output signal values. It is the Mealy model.
• In the second case (Example 4.5) the external output values are a function of the internal state only. The directed edges are labelled with the input signal values that cause the transition, and the vertices with the corresponding output signal values. It is the Moore model.

Observe that a Moore model is a particular case of the Mealy model in which all edges whose origin is the same internal state are labelled with the same output signal values. As an example, the graph of Fig. 4.8 describes the same sequential circuit as the graph of Fig. 4.7b. Conversely, it can be demonstrated that a sequential circuit defined by a Mealy model can also be defined by a Moore model but, generally, with more internal states.


Fig. 4.8 Mealy model of Fig. 4.7b


Fig. 4.9 Photo of a robot vacuum cleaner (courtesy of iRobot Corporation)

4.3.2

Example of Explicit Description Generation

Given a functional specification of a sequential circuit, for example in a natural language, how can a state transition graph be defined? There is obviously no systematic and universal method to translate an informal specification to a state transition graph. It is mainly a matter of common sense and imagination. As an example, consider the circuit that controls a robot vacuum cleaner (the photo of a commercial robot is shown in Fig. 4.9). To make the example more tractable a simplified version of the robot is defined:

• The robot includes a sensor that generates a binary signal OB = 1 when it detects an obstacle in front of it.
• The robot can execute three orders under the control of two binary inputs LR (left rotate) and RR (right rotate): move forward (LR = RR = 0), turn 90° to the left (LR = 1, RR = 0), and turn 90° to the right (LR = 0, RR = 1).

The specification of the robot control circuit is the following:

• If there is no obstacle: move forward.
• When an obstacle is detected: turn to the right until there is no more obstacle.
• The next time an obstacle is detected: turn to the left until there is no more obstacle.
• The next time an obstacle is detected: turn to the right until there is no more obstacle, and so on.

This behavior cannot be implemented by a combinational circuit. In order to take a decision it is not enough to know whether there is an obstacle or not; it is also necessary to know the latest ordered movements:


Fig. 4.10 Robot control circuit

• If the previous command was turn to the right and if there is no obstacle then move forward.
• If the previous command was turn to the right and there is still an obstacle then keep turning to the right.
• If the previous command was turn to the left and if there is no obstacle then move forward.
• If the previous command was turn to the left and there is still an obstacle then keep turning to the left.
• If the previous command was move forward and if there is no obstacle then keep moving forward.
• If the previous command was move forward and there is an obstacle and the latest rotation was to the left then turn to the right.
• If the previous command was move forward and there is an obstacle and the latest rotation was to the right then turn to the left.

This analysis suggests the definition of four internal states:

• SAL: The robot is moving forward and the latest rotation was to the left.
• SAR: The robot is moving forward and the latest rotation was to the right.
• SRR: The robot is turning to the right.
• SRL: The robot is turning to the left.

With those internal states the behavior of the robot control circuit is defined by the state transition graph of Fig. 4.10b (Moore model). To define the combinational circuit of Fig. 4.10a the internal states of Fig. 4.10b must be encoded. For example:

SAR: q0q1 = 00, SRR: q0q1 = 01, SAL: q0q1 = 10, SRL: q0q1 = 11.    (4.4)

The following case instruction defines the combinational circuit function:

case q0q1 is
  when 00 => if OB = 0 then q0Δq1Δ = 00; else q0Δq1Δ = 11; end if;
             RR = 0; RL = 0;
  when 01 => if OB = 0 then q0Δq1Δ = 00; else q0Δq1Δ = 01; end if;
             RR = 1; RL = 0;


Table 4.1 Robot control circuit: next state table

Current state   Input: OB   Next state
SAR             0           SAR
SAR             1           SRL
SRR             0           SAR
SRR             1           SRR
SAL             0           SAL
SAL             1           SRR
SRL             0           SAL
SRL             1           SRL

Table 4.2 Robot control circuit: output table

Current state   Outputs: RR RL
SAR             0 0
SRR             1 0
SAL             0 0
SRL             0 1

  when 10 => if OB = 0 then q0Δq1Δ = 10; else q0Δq1Δ = 01; end if;
             RR = 0; RL = 0;
  when 11 => if OB = 0 then q0Δq1Δ = 10; else q0Δq1Δ = 11; end if;
             RR = 0; RL = 1;
end case;
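A behavioral sketch of this controller in Python (our own transcription of Tables 4.1 and 4.2, not part of the book) makes the alternation of the rotation direction visible:

```python
# Robot controller transcribed from Tables 4.1 and 4.2.
NEXT = {
    ("SAR", 0): "SAR", ("SAR", 1): "SRL",
    ("SRR", 0): "SAR", ("SRR", 1): "SRR",
    ("SAL", 0): "SAL", ("SAL", 1): "SRR",
    ("SRL", 0): "SAL", ("SRL", 1): "SRL",
}
OUT = {"SAR": (0, 0), "SRR": (1, 0), "SAL": (0, 0), "SRL": (0, 1)}  # (RR, RL)

def robot(obstacle_samples, state="SAR"):
    """Return the sequence of states visited, one per clock edge."""
    states = []
    for ob in obstacle_samples:
        state = NEXT[(state, ob)]
        states.append(state)
    return states

# Obstacle, free path, obstacle, free path: the rotation direction alternates.
print(robot([1, 0, 1, 0]))  # ['SRL', 'SAL', 'SRR', 'SAR']
```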

4.3.3

Next State Table and Output Table

Instead of defining the behavior of a sequential circuit with a state transition graph, another option is to use tables. Once the set of internal states is known, the specification of the circuit of Fig. 4.3 amounts to the specification of the combinational circuit, for example by means of two tables:

• A table (next state table) that defines the next internal state as a function of the current state and of the external input values
• A table (output table) that defines the external output values as a function of the current internal state (Moore model) or as a function of the current internal state and of the external input values (Mealy model)

As an example, the state transition diagram of Fig. 4.10b can be described by Tables 4.1 and 4.2.

4.4

Bistable Components

Bistable components such as latches and flip-flops are basic building blocks of any sequential circuit. They are used to implement the memory block of Fig. 4.3 and to synchronize the circuit operations with an external clock signal (Sect. 4.2).

4.4.1

1-Bit Memory

A simple 1-bit memory is shown in Fig. 4.11a. It consists of two interconnected inverters. This circuit has two stable states. In Fig. 4.11b the first inverter input is equal to 0, so that the second inverter input is equal to 1 and its output is equal to 0. Thus this is a stable state. Similarly, another stable state is shown in Fig. 4.11c. This circuit has the capacity to store one bit of data. It remains to define the way a particular stable state can be selected.

To control the state of the 1-bit memory, the circuit of Fig. 4.11a is completed with two tristate buffers controlled by an external Load signal (Fig. 4.12a):

• If Load = 1 then the circuit of Fig. 4.12a is equivalent to the circuit of Fig. 4.12b: the input D value (0 or 1) is transmitted to the first inverter input, so that the output P is equal to NOT(D) and Q = NOT(P) = D; on the other hand the output of the second inverter is disconnected from the first inverter input (buffer 2 in state Z, Sect. 2.4.3).
• If Load = 0 then the circuit of Fig. 4.12a is equivalent to the circuits of Fig. 4.12c and of Fig. 4.11a; thus it has two stable states; the value of Q is equal to the value of D just before the transition of signal Load from 1 to 0.

Observe that the two tristate buffers of Fig. 4.12a implement the same function as a 1-bit MUX2-1. The circuit of Fig. 4.12a is a D-type latch. It has two inputs: a data input D and a control input Load (sometimes called Enable). It has two outputs: Q and P = NOT(Q). Its symbol is shown in Fig. 4.13. Its working can be summarized as follows: when Load = 1 the value of D is sampled, and when Load = 0 this sampled value remains internally stored.

Fig. 4.11 1-Bit memory


Fig. 4.12 D-type latch

Fig. 4.13 D-type latch symbol



Formally, a D-type latch could be defined as a sequential circuit with an external input D, an external output Q (plus an additional output NOT(Q)), and two internal states S0 and S1. The next state table and the output table are shown in Table 4.3. However this circuit is not synchronized by an external clock signal; it is a so-called asynchronous sequential circuit. In fact, the external input Load could be considered as a clock signal input: the value of D is read and stored on each 1-to-0 transition (falling edge) of the Load signal, so that the working of a D-type latch could be described by the equation QΔ = D. Nevertheless, when Load = 1 then Q = D (transparent state) and any change of D immediately causes the same change on Q, without any type of external synchronization.

Another way to control the internal state of the 1-bit memory of Fig. 4.11a is to replace the inverters by 2-input NAND or NOR gates. As an example, the circuit of Fig. 4.14a, built with NOR gates, is an SR latch. It works as follows:

• If S = R = 0 then both NOR gates are equivalent to inverters (Fig. 4.14b) and the circuit of Fig. 4.14a is equivalent to a 1-bit memory (Fig. 4.11a).
• If S = 1 and R = 0 then the output of the first NOR is equal to 0, whatever the other input value, and the second NOR is equivalent to an inverter (Fig. 4.14c); thus Q = 1.
• If S = 0 and R = 1 then the output of the second NOR is equal to 0, whatever the other input value, and the first NOR is equivalent to an inverter (Fig. 4.14d); thus Q = 0.

To summarize, with S = 1 and R = 0 the latch is set to 1; with S = 0 and R = 1 the latch is reset to 0; with S = R = 0 the latch stores the latest written value. The combination S = R = 1 is not used (not allowed). The symbol of an SR latch is shown in Fig. 4.15.

Table 4.3 Next state table and output table of a D-type latch

Load   Current state   D   Next state   Output
0      S0              –   S0           0
1      S0              0   S0           0
1      S0              1   S1           0
0      S1              –   S1           1
1      S1              0   S0           1
1      S1              1   S1           1


Fig. 4.14 SR latch

Fig. 4.15 Symbol of an SR latch



An SR latch is an asynchronous sequential circuit. Its state can only change on a rising edge of either S or R, and the new state is defined by the following equation: QΔ = S + NOT(R)·Q.
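The next-state equation can be checked exhaustively over the three allowed input combinations. In this small Python check (ours), the complement NOT(R) is computed as 1 − r on 0/1 values:

```python
# SR-latch next-state equation: Q_next = S + NOT(R)·Q, S = R = 1 excluded.
def sr_next(s, r, q):
    assert not (s and r), "S = R = 1 is not allowed"
    return s | ((1 - r) & q)

q = 0
q = sr_next(1, 0, q)  # set:   q becomes 1
q = sr_next(0, 0, q)  # hold:  q stays 1
q = sr_next(0, 1, q)  # reset: q becomes 0
print(q)  # 0
```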

4.4.2

Latches and Flip-Flops

Consider again the sequential circuit of Fig. 4.3. The following question has not yet been answered: how is the memory block implemented? It has two functions: it stores the internal state and it synchronizes the operations by periodically updating the internal state under the control of a clock signal. In Sect. 4.4.1 a 1-bit memory component has been described, namely the D-type latch. A first option is shown in Fig. 4.16, where the memory block is made up of m D-type latches. The clock signal is used to periodically load new values within this m-bit memory. However, this circuit would generally not work correctly. The problem is that when clock = 1 all latches are in transparent mode, so that after a rising edge of clock the new values of q0, q1, . . ., qm−1 could modify the values of q0Δ, q1Δ, . . ., qm−1Δ before clock goes back to 0. To work correctly the clock pulses should be shorter than the minimum propagation time of the combinational circuit. For that reason another type of 1-bit memory element has been developed.

A D-type flip-flop is a 1-bit memory element whose state can only change on a positive edge of its clock input. A possible implementation and its symbol are shown in Fig. 4.17. It consists of two D-type latches controlled by clock and NOT(clock), respectively, so that they are never in transparent

Fig. 4.16 Memory block implemented with latches


Fig. 4.17 D-type flip-flop



mode at the same time. When clock = 0 the first latch is in transparent mode so that q1 = d and the second latch stores the latest read value of q1. When clock = 1 the first latch stores the latest read value of d and the second latch is in transparent mode so that q = q1. Thus, the state q of the second latch is updated on the positive edge of clock.

Example 4.6 Compare the circuits of Fig. 4.18a, c: with the same input signals Load and D (Fig. 4.18b, d) the output signals Q are different. In the first case (Fig. 4.18b), the latch transmits the value of D to Q as long as Load = 1. In the second case (Fig. 4.18d) the flip-flop transmits the value of D to Q on the positive edges of Load.

Flip-flops need more transistors than latches. As an example the flip-flop of Fig. 4.17 contains two latches. But circuits using flip-flops are much more reliable: the circuit of Fig. 4.19 works correctly even if the clock pulses are much longer than the combinational circuit propagation time; the only

Fig. 4.18 Latch vs. flip-flop


Fig. 4.19 Memory block implemented with flip-flops



Fig. 4.20 D-type flip-flop with asynchronous inputs set and reset


timing condition is that the clock period must be greater than the combinational circuit propagation time. For that reason flip-flops are the memory components that are used to implement the memory block of sequential circuits.

Comment 4.2 D-type flip-flops can be defined as synchronized sequential circuits whose equation is

QΔ = D.    (4.5a)

Other types of flip-flops have been developed: SR flip-flop, JK flip-flop, and T flip-flop. Their equations are

QΔ = S + NOT(R)·Q,    (4.5b)

QΔ = J·NOT(Q) + NOT(K)·Q,    (4.5c)

QΔ = T·NOT(Q) + NOT(T)·Q.    (4.5d)
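Written as Boolean functions on 0/1 values (with NOT(x) computed as 1 − x), equations (4.5a)–(4.5d) can be exercised directly. This Python sketch is ours:

```python
# Next-state equations (4.5a)-(4.5d) of the D, SR, JK and T flip-flops.
def d_ff(d, q):
    return d                               # (4.5a)

def sr_ff(s, r, q):
    return s | ((1 - r) & q)               # (4.5b)

def jk_ff(j, k, q):
    return (j & (1 - q)) | ((1 - k) & q)   # (4.5c)

def t_ff(t, q):
    return (t & (1 - q)) | ((1 - t) & q)   # (4.5d)

# With J = K = 1 a JK flip-flop toggles, like a T flip-flop with T = 1.
print(jk_ff(1, 1, 0), jk_ff(1, 1, 1))  # 1 0
print(t_ff(1, 0), t_ff(1, 1))          # 1 0
```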

Flip-flops are synchronous sequential circuits. Thus, (4.5a)–(4.5d) define the new internal state QΔ that will substitute the current value of Q on an active edge (positive or negative depending on the flip-flop type) of clock. Inputs D, S, R, J, K, and T are sometimes called synchronous inputs because their values are only taken into account on active edges of clock. Some components also have asynchronous inputs. The symbol of a D-type flip-flop with asynchronous inputs set and reset is shown in Fig. 4.20a. As long as set = reset = 0, it works as a synchronous circuit so that its state Q only changes on an active edge of clock according to (4.5a). However, if at some moment set = 1 then, independently of the values of clock and D, Q is immediately set to 1, and if at some moment reset = 1 then, independently of the values of clock and D, Q is immediately reset to 0. An example of chronogram is shown in Fig. 4.20b. Observe that the asynchronous inputs have an immediate effect on Q and have priority with respect to clock and D.

4.5

Synthesis Method

All the concepts necessary to synthesize a sequential circuit have been studied in the preceding sections. The starting point is a state transition graph or equivalent next state table and output table. Consider again the robot control system of Sect. 4.3.2. It has four internal states SAR, SRR, SRL, and SAL and it is described by the state transition graph of Fig. 4.10b or by a next state table (Table 4.1) and an output table (Table 4.2). The output signal values are defined according to the Moore model.

Fig. 4.21 Robot control circuit with D-type flip-flops


Table 4.4 Next state functions q1Δ and q0Δ

Current state q1 q0   Input: OB   Next state q1Δ q0Δ
00                    0           00
00                    1           11
01                    0           00
01                    1           01
10                    0           10
10                    1           01
11                    0           10
11                    1           11

Table 4.5 External output functions RR and RL

Current state q1 q0   Outputs: RR RL
00                    0 0
01                    1 0
10                    0 0
11                    0 1

The general circuit structure is shown in Fig. 4.3. In this example there is an external input OB (n = 1) and two external outputs RR and RL (k = 2), and the four internal states can be encoded with two variables q0 and q1 (m = 2). Thus, the circuit of Fig. 4.10a is obtained. To complete the design, a first operation is to choose an encoding of the four states SAR, SRR, SRL, and SAL with two binary variables q0 and q1; use for example the encoding of (4.4). Another decision to be taken is the structure of the memory block. This point has already been analyzed in Sect. 4.4.2: the conclusion was that the more reliable option is to use flip-flops, for example D-type flip-flops (Fig. 4.17). The circuit that implements the robot control circuit is shown in Fig. 4.21 (Fig. 4.19 with n = 1, k = 2, and m = 2). To complete the sequential circuit design it remains to define the functions implemented by the combinational circuit. From Tables 4.1 and 4.2, from the chosen internal state encoding (4.4), and from the D-type flip-flop specification (4.5a), Tables 4.4 and 4.5 that define the combinational circuit are deduced. The implementation of a combinational circuit defined by truth tables has been studied in Chap. 2. The equations that correspond to Tables 4.4 and 4.5 are the following:


Fig. 4.22 Robot control circuit implemented with logic gates and D-type flip-flops


D1 = q1Δ = NOT(q1)·NOT(q0)·OB + q1·NOT(OB) + q1·q0,
D0 = q0Δ = OB,
RR = NOT(q1)·q0,
RL = q1·q0.

The corresponding circuit is shown in Fig. 4.22.

Comments 4.3

• Flip-flops generally have two outputs Q and NOT(Q), so that the internal state variables qi are available under normal and under inverted form, and no additional inverters are necessary.
• An external asynchronous reset has been added. It defines the initial internal state (q1q0 = 00, that is, state SAR). In many applications it is necessary to set the circuit to a known initial state. Furthermore, to test the working of a sequential circuit it is essential to know its initial state.

As a second example consider the state transition graph of Fig. 4.23, in this case a Mealy model. Its next state and output tables are shown in Table 4.6. The three internal states can be encoded with two internal state variables q1 and q0, for example

S0: q1q0 = 00, S1: q1q0 = 01, S2: q1q0 = 10.    (4.6)

The circuit structure is shown in Fig. 4.24. According to Table 4.6 and the encoding (4.6), the combinational circuit is defined by Table 4.7. The equations that correspond to Table 4.7 are the following:

q1Δ = q0 + q1·NOT(a),
q0Δ = NOT(q1)·NOT(q0)·a,
z = q1·a.
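These equations can be verified mechanically against the machine of Fig. 4.23: for every (state, input) pair they must reproduce the encoded next state and the output z. A Python check (ours):

```python
# Verify the synthesized equations against the Mealy machine of Fig. 4.23,
# with the encoding S0 = 00, S1 = 01, S2 = 10 (as q1 q0).
ENC = {"S0": (0, 0), "S1": (0, 1), "S2": (1, 0)}
TABLE = {  # (state, a) -> (next state, z)
    ("S0", 0): ("S0", 0), ("S0", 1): ("S1", 0),
    ("S1", 0): ("S2", 0), ("S1", 1): ("S2", 0),
    ("S2", 0): ("S2", 0), ("S2", 1): ("S0", 1),
}

def comb(q1, q0, a):
    q1n = q0 | (q1 & (1 - a))       # q1_next = q0 + q1·NOT(a)
    q0n = (1 - q1) & (1 - q0) & a   # q0_next = NOT(q1)·NOT(q0)·a
    z = q1 & a                      # z = q1·a
    return q1n, q0n, z

for (s, a), (s_next, z) in TABLE.items():
    q1, q0 = ENC[s]
    assert comb(q1, q0, a) == (*ENC[s_next], z)
print("all transitions check out")
```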


Fig. 4.23 Sequential circuit defined by a state transition graph (Mealy model)


Table 4.6 Next state and output tables (Mealy model)

Current state   a   Next state   z
S0              0   S0           0
S0              1   S1           0
S1              0   S2           0
S1              1   S2           0
S2              0   S2           0
S2              1   S0           1

Fig. 4.24 Circuit structure


Table 4.7 Combinational circuit: truth table

Current state q1 q0   Input: a   Next state q1Δ q0Δ   Output: z
00                    0          00                   0
00                    1          01                   0
01                    0          10                   0
01                    1          10                   0
10                    0          10                   0
10                    1          00                   1
11                    0          –                    –
11                    1          –                    –

4.6

Sequential Components

This section deals with particular sequential circuits that are building blocks of larger circuits, namely registers, counters, and memory blocks.

4.6.1

Registers

An n-bit register is a set of n D-type flip-flops or latches controlled by the same clock signal. They are used to store n-bit data. A register made up of n D-type flip-flops with asynchronous reset is shown in Fig. 4.25a and the corresponding symbol in Fig. 4.25b. This parallel register is a sequential circuit with 2^n states encoded by an n-bit vector q = qn−1 qn−2 . . . q0 and defined by the following equations:

qΔ = IN, OUT = q.    (4.7)

Some registers have an additional OE (Output Enable) asynchronous control input (Fig. 4.26). This permits connecting the register output to a bus without additional tristate buffers (they are included in the register). In this example the control input is active when it is equal to 0 (active-low input, Fig. 2.29); for that reason the signal is written with an overbar (an active-low OE) instead of plain OE. Figures 4.25 and 4.26 are two examples of parallel registers. Other registers can be defined, for example a set of n latches controlled by a load signal, instead of flip-flops, in which case OUT = IN (transparent state) when load = 1. Thus they should not be used within feedback loops to avoid


Fig. 4.25 n-Bit register


Fig. 4.26 n-Bit register with output enable (OE)


Fig. 4.27 Shift register

Fig. 4.28 Division and multiplication by 2


unstable states. Other examples of optional configurations: clock active-low or active-high, asynchronous set or reset, and OE active-low or active-high.

Shift registers are another type of commonly used sequential components. Like parallel registers, they consist of a set of n D-type flip-flops controlled by the same clock signal, so that they store an n-bit vector, but furthermore they can shift the stored data by one position to the right (or to the left) at each clock pulse. An example of shift register is shown in Fig. 4.27. It has a serial input serial_in and a parallel output OUT. At each clock pulse a new bit is inputted, the stored word is shifted by one position to the right, and the last (least significant) bit is lost. This shift register is a sequential circuit with 2^n states encoded by an n-bit vector q = qn−1 qn−2 . . . q0 and defined by the following equations:

qn−1Δ = serial_in, qiΔ = qi+1 for all i = 0 to n−2, OUT = q.    (4.8)

Shift registers have several applications. For example, assume that the current state q represents an n-bit natural. Then a shift to the right with serial_in = 0 (Fig. 4.28a) amounts to the integer division of q by 2, and a shift to the left with serial_in = 0 (Fig. 4.28b) amounts to the multiplication of q by 2 mod 2^n. For example, if n = 8 and the current state q is 10010111 (151 in decimal) then after a shift to the right q = 01001011 (75 = ⌊151/2⌋ in decimal) and after a shift to the left q = 00101110 (46 = 302 mod 256 in decimal).

There are several types of shift registers that can be classified according to (among others)

• The shift direction: shift to the left, shift to the right, bidirectional shift, cyclic to the left, cyclic to the right
• The input type: serial or parallel input
• The output type: serial or parallel output

In Fig. 4.29 a 4-bit bidirectional shift register with serial input and parallel output is shown. When L/R = 0 the stored word is shifted to the left and when L/R = 1 the stored word is shifted to the right.
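The arithmetic reading of the shifts can be reproduced with integer operations; this Python fragment (ours) mirrors the n = 8 example above:

```python
# Right shift = integer division by 2; left shift = times 2 modulo 2^n.
N = 8
q = 0b10010111                     # 151 in decimal

right = q >> 1                     # serial_in = 0 enters at the MSB side
left = (q << 1) & ((1 << N) - 1)   # the MSB is lost, 0 enters at the LSB

print(right, left)  # 75 46
assert right == 151 // 2
assert left == (2 * 151) % 2**N
```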


Fig. 4.29 4-Bit bidirectional shift register with serial input and parallel output


Fig. 4.30 4-Bit shift register with serial and parallel input and with parallel output

This shift register is a sequential circuit with 16 states encoded by a 4-bit vector q = q3 q2 q1 q0 and defined by the following equations:

q3Δ = NOT(L/R)·q2 + L/R·IN,
qiΔ = NOT(L/R)·qi−1 + L/R·qi+1 for i = 1 or 2,
q0Δ = NOT(L/R)·IN + L/R·q1,
OUT = q.    (4.9)

Another example is shown in Fig. 4.30: a 4-bit shift register, with serial input IN3, parallel input IN = (IN3, IN2, IN1, IN0), and parallel output OUT. When L/S = 0 the parallel input value is loaded within the register, and when L/S = 1 the stored word is shifted to the right and IN3 is stored within the most significant register bit. This shift register is a sequential circuit with 16 states encoded by a 4-bit vector q = q3 q2 q1 q0 and defined by the following equations:

q3Δ = IN3,
qiΔ = NOT(L/S)·INi + L/S·qi+1 for i = 0, 1, 2,
OUT = q.    (4.10)

Shift registers with other control inputs can be defined. For example:

• PL (parallel load): When active, input bits INn−1, INn−2, . . ., IN0 are immediately loaded in parallel, independently of the clock signal (asynchronous load).
• CE (clock enable): When active the clock signal is enabled; when nonactive the clock signal is disabled and, in particular, there is no shift.
• OE (output enable): When equal to 0 all output buffers are enabled and when equal to 1 all output buffers are in high impedance (state Z, disconnected).


Fig. 4.31 Symbol of a shift register with control inputs PL or L/S, CE, and OE



Fig. 4.32 Parallel-to-serial and serial-to-parallel conversion

Figure 4.31 is the symbol of a shift register with parallel input in, serial output serial_out, parallel output out, negative output enable control signal OE (to enable the parallel output), clock enable control signal CE, and asynchronous reset input. With regard to the load and shift operations two options are considered: (1) with PL (asynchronous load), a new data item is immediately stored when PL = 1, and the stored data is synchronously shifted to the right on a clock pulse when CE = 1; (2) with L/S (synchronous load), a new data item is stored on a clock pulse when L/S = 0 and CE = 1, and the stored data is synchronously shifted to the right on a clock pulse when L/S = 1 and CE = 1.

Apart from arithmetic operations (multiplication and division by 2) shift registers are used in other types of applications. One of them is the parallel-to-serial and serial-to-parallel conversion in data transmission systems. Assume that a system called "origin" must send n-bit data to another system called "destination" using for that a 1-bit transmission channel (Fig. 4.32). The solution is a parallel-in serial-out shift register on the origin side and a serial-in parallel-out shift register on the destination side. To transmit a data item, it is first loaded within register 1 (parallel input); then it is serially shifted out of register 1 (serial output), transmitted on the 1-bit transmission channel, and shifted into register 2 (serial input); when all n bits have been transmitted the transmitted data is read from register 2 (parallel output).

Another application of shift registers is the recognition of sequences. Consider a sequential circuit with a 1-bit input in and a 1-bit output out. It receives a continuous string of bits and must generate an output out = 1 every time that the six latest received bits in(t) in(t−1) in(t−2) in(t−3) in(t−4) in(t−5) are 100101. A solution is shown in Fig. 4.33: a serial-in parallel-out shift register that stores the five values in(t−1) in(t−2) in(t−3) in(t−4) in(t−5) and generates out = 1 when in(t) in(t−1) in(t−2) in(t−3) in(t−4) in(t−5) = 100101.

Another example is shown in Fig. 4.34. This circuit has an octal input in = in2 in1 in0 and a 1-bit output out. It receives a continuous string of digits and must generate an output out = 1 every time


Fig. 4.33 Detection of sequence 100101

Fig. 4.34 Detection of sequence 1026

in2

D Q Q

D Q Q

D Q Q

in1

D Q Q

D Q Q

D Q Q out

in0

D Q Q

D Q Q

D Q Q

clock

that the four latest received digits in(t) in(t−1) in(t−2) in(t−3) are 1026 (in binary 001 000 010 110). The circuit of Fig. 4.34 consists of three 3-bit shift registers that detect the 1-bit sequences in2 = 0001, in1 = 0011, and in0 = 1000, respectively. An AND3 output gate generates out = 1 every time that the three sequences are detected.
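The behavior of the Fig. 4.33 detector can be cross-checked with a short Python sketch (the function and variable names are ours, not the book's): a software shift register keeps the five previous bits, and the output gate compares in(t) in(t−1) … in(t−5) with the pattern.

```python
# Behavioral sketch of the Fig. 4.33 sequence detector (our naming).

def make_detector(pattern):
    state = [0] * (len(pattern) - 1)    # in(t-1) ... in(t-5): the register

    def clock(bit):
        window = [bit] + state          # in(t) in(t-1) ... in(t-5)
        out = 1 if window == pattern else 0
        state[:] = window[:-1]          # shift to the right on the clock edge
        return out

    return clock

detect = make_detector([1, 0, 0, 1, 0, 1])          # out = 1 on ... 100101
outs = [detect(b) for b in [1, 0, 1, 0, 0, 1, 0, 1]]
# out goes to 1 at the sixth bit, when the window first holds 100101
```

Note that the pattern is listed in in(t) … in(t−5) order, so the chronological input stream that triggers it is 1 0 1 0 0 1.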

4.6.2 Counters

Counters constitute another family of commonly used sequential components. An m-state counter (or mod m counter) is a Moore sequential circuit without external input and with an n-bit output q that is also its internal state and represents a natural belonging to the set {0, 1, ..., m − 1}. At each clock pulse the internal state is increased or decreased. Thus the next-state equation is

qΔ = (q + 1) mod m (up counter) or qΔ = (q − 1) mod m (down counter). (4.11)

Thus counters generate cyclic sequences of states. In the case of a mod m up counter the generated sequence is ... 0 1 ... m−2 m−1 0 1 ...

Definitions 4.1
• An n-bit binary up counter has m = 2^n states encoded according to the binary numeration system. If n = 3 it generates the following sequence: 000 001 010 011 100 101 110 111 000 001 ...
• An n-bit binary down counter has m = 2^n states encoded according to the binary numeration system. If n = 3 it generates the following sequence: 000 111 110 101 100 011 010 001 000 111 ...
• A binary coded decimal (BCD) up counter has ten states encoded according to the binary numeration system (BCD code). It generates the following sequence: 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 0000 0001 ... A BCD down counter is defined in a similar way.

4 Sequential Circuits

[Fig. 4.35 n-bit up counter: a. an n-bit register plus a circuit computing (q + cyIN) mod m; b. symbol of the n-bit counter with EN and reset inputs]

[Fig. 4.36 n-bit half adder: a chain of HA cells computing q0Δ … qn−1Δ and the carries cy1 … cyn−1, with outgoing carry cyOUT]
• An n-bit Gray counter has m = 2^n states encoded in such a way that two successive states differ in only one position (one bit). For example, with n = 3, a Gray counter sequence is 000 010 110 100 101 111 011 001 000 010 ...
• Bidirectional counters have a control input U/D (up/down) that defines the counting direction (up or down).

The general structure of a counter is a direct consequence of its definition. If m is an n-bit number, then an m-state up counter consists of an n-bit register that stores the internal state q and of a modulo m adder that computes (q + 1) mod m. In Fig. 4.35a a 1-operand adder (also called half adder) with a carry input cyIN is used. The carry input can be used as an enable (EN) control input:

qΔ = EN·[(q + 1) mod m] + EN′·q. (4.12)

The corresponding symbol is shown in Fig. 4.35b. If m = 2^n then the mod m adder of Fig. 4.35a can be implemented by the circuit of Fig. 4.36 that consists of n 1-bit half adders. Each of them computes

qiΔ = qi ⊕ cyi, cyi+1 = qi·cyi. (4.13)

Observe that cyOUT could be used to enable another counter so as to generate a 2n-bit counter (2^2n = m^2 states) with two n-bit counters.

Example 4.7 According to (4.13), with cy0 = cyIN = 1, the equations of a 3-bit up counter are q0Δ = q0 ⊕ 1 = q0′, q1Δ = q1 ⊕ q0, q2Δ = q2 ⊕ q1·q0, to which corresponds the circuit of Fig. 4.37.

Apart from reset and EN (Fig. 4.35), other control inputs can be defined, for example OE (output enable), as in the case of parallel registers (Fig. 4.26). An additional state output TC (terminal count) can also be defined: it is equal to 1 if, and only if, the current state q = m − 1. This signal is used to interconnect counters in series. If m = 2^n then (Figs. 4.36 and 4.37) TC = cyOUT.
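Equations (4.13) can be exercised directly in software. The sketch below (our naming; a behavioral model, not a synthesis description) ripples a carry through the half-adder cells exactly as in Fig. 4.36:

```python
# Software model of the ripple of half adders in Fig. 4.36, eq. (4.13).
# q is the bit list [q0, q1, ..., qn-1].

def counter_step(q, cy_in=1):
    """One clock edge of an n-bit binary up counter; returns (next q, cyOUT)."""
    cy, nxt = cy_in, []
    for qi in q:              # from bit 0 upward, as the carries ripple
        nxt.append(qi ^ cy)   # qiD = qi XOR cyi
        cy = qi & cy          # cy(i+1) = qi AND cyi
    return nxt, cy            # cy is cyOUT (= TC when the state was all ones)

q, seq = [0, 0, 0], []
for _ in range(9):            # record the state, then apply a clock pulse
    seq.append(q[2] * 4 + q[1] * 2 + q[0])
    q, tc = counter_step(q)   # seq: 0 1 2 3 4 5 6 7 0
```

The wrap from 111 back to 000 produces cyOUT = 1, the TC pulse used to chain a second counter.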

[Fig. 4.37 3-bit up counter: three D flip-flops with reset, all driven by the same clock]

[Fig. 4.38 3-bit up counter with active-low OE and with TC]

[Fig. 4.39 Bidirectional n-bit counter: a. an n-bit register plus a circuit computing (q ± b_cIN) mod m; b. symbol with U/D, EN and reset inputs]

Example 4.8 In Fig. 4.38 an active-low OE control input and a TC output are added to the counter of Fig. 4.37. TC = 1 when q = 7 (q2 = q1 = q0 = 1).

To implement a bidirectional (up/down) counter, the adder of Fig. 4.35 is replaced by an adder-subtractor (Fig. 4.39a). A U/D (up/down) control input permits choosing between addition (U/D = 0) and subtraction (U/D = 1). Input b_cIN is an incoming carry or borrow that can be used to enable the counter. Thus

[Fig. 4.40 State transition graph of a bidirectional 3-bit counter: states 000, 001, 010, …, 111 in a cycle, traversed upward when U/D = 0 and downward when U/D = 1]

[Fig. 4.41 Counter with parallel load: a. an n-bit MUX2-1 selects, under control of load, either (q + cyIN) mod m or the external input in as the next register value; b. symbol of the n-bit programmable counter with load, EN and reset inputs]

qΔ = EN·(U/D)′·[(q + 1) mod m] + EN·(U/D)·[(q − 1) mod m] + EN′·q. (4.14)

The corresponding symbol is shown in Fig. 4.39b. As an example, the state transition graph of Fig. 4.40 defines a 3-bit bidirectional counter without EN control input (b_cIN = 1). In some applications it is necessary to define counters whose internal state can be loaded from an external input. Examples of applications are programmable timers and microprocessor program counters. An example of a programmable counter with parallel load is shown in Fig. 4.41a. An n-bit MUX2-1 permits writing into the state register either qΔ or an external input in. If load = 0 it works as an up counter, and when load = 1 the next internal state is in. Thus

qΔ = EN·load′·[(q + 1) mod m] + EN·load·in + EN′·q. (4.15)

The corresponding symbol is shown in Fig. 4.41b.

Comment 4.3
Control inputs reset and load permit changing the normal counter sequence:
• If the counter is made up of flip-flops with set and reset inputs, an external reset command can change the internal state to any value (depending on the connection of the external reset signal to


individual flip-flop set or reset inputs), but always the same value, and this operation is asynchronous.
• The load command permits changing the internal state to any value defined by the external input in, and this operation is synchronous.

In Fig. 4.42 a counter with both asynchronous and synchronous reset is shown: the reset control input sets the internal state to 0 in an asynchronous way, while the synch_reset input sets the internal state to 0 in a synchronous way.

Some typical applications of counters are now described. A first application is the implementation of timers. Consider an example: synthesize a circuit with a 1-bit output z that generates a positive pulse on z every 5 s, assuming that a 1 kHz oscillator is available. The circuit is shown in Fig. 4.43. It consists of the oscillator and of a mod 5000 counter with state output TC (terminal count). The oscillator period is equal to 1 ms, so a mod 5000 counter generates a TC pulse every 5000 ms, that is, every 5 s.

A second application is the implementation of systems that count events. As an example, a circuit that counts the number of 1s in a binary sequence is shown in Fig. 4.44a. It is assumed that the binary sequence is synchronized by a clock signal. This circuit is an up counter controlled by the same clock signal. The binary sequence is connected to the EN control input, and the counter output gives the number of 1s within the input sequence. Every time a 1 is inputted, the counter is enabled and one unit is added to the count. An example is shown in Fig. 4.44b.
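Taken together, equations (4.14) and (4.15) describe a counter with enable, direction, and load controls. The sketch below merges them into one next-state function; the priority order and all names are our assumptions, since the book defines U/D and load on separate counters:

```python
# Behavioral sketch combining eqs. (4.14) and (4.15) (our naming and
# priority order: EN first, then load, then U/D).

def next_state(q, m, EN=1, U_D=0, load=0, data=0):
    """Next state of a mod-m counter on one clock pulse."""
    if not EN:
        return q                                 # EN = 0: hold the state
    if load:
        return data                              # synchronous parallel load
    return (q - 1) % m if U_D else (q + 1) % m   # U/D = 0: up, U/D = 1: down

q, up = 0, []
for _ in range(7):                # seven pulses of a mod-6 up count
    q = next_state(q, 6)
    up.append(q)                  # 1 2 3 4 5 0 1
```

A down step from state 0 wraps to m − 1, which is the cyclic behavior shown in the state transition graph of Fig. 4.40.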

[Fig. 4.42 Asynchronous and synchronous reset: an n-bit programmable counter with its parallel input tied to 0 and synch_reset driving the load input]

[Fig. 4.43 Timer: a 1 kHz oscillator clocks a mod 5000 counter; its TC output is z (T = 5 s)]

[Fig. 4.44 Number of 1's counter: a. the input sequence drives the EN input of an n-bit counter whose output is number; b. an example run]

The 1-bit counter of Fig. 4.45a is a frequency divider. On each positive edge of in, connected to the clock input, the current value of out = Q is replaced by its inverse Q′ (Fig. 4.45b). Thus out is a square wave whose frequency is half that of the input in.

A last example of application is the implementation of circuits that generate predefined sequences. For example, to implement a circuit that repeatedly generates the sequence 10010101, a 3-bit mod 8 counter and a combinational circuit that computes a 3-variable switching function out1 are used (Fig. 4.46a). Function out1 (Table 4.8) associates a bit of the desired output sequence to each counter state. Another example is given in Fig. 4.46b and Table 4.8. This circuit repeatedly generates the sequence 100101. It consists of a mod 6 counter and a combinational circuit that computes a 3-variable switching function out2 (Table 4.8). The mod 6 counter of Fig. 4.46b can be synthesized as shown in Fig. 4.35a with m = 6 and cyIN = 1. The combinational circuit that computes (q + 1) mod 6 is defined by the truth table of Table 4.9 and can be synthesized using the methods proposed in Chap. 2.

[Fig. 4.45 Frequency divider by 2: a. a D flip-flop whose inverted output is fed back to D and whose clock input is in; b. waveforms of in and out]

[Fig. 4.46 Sequence generators: a. a mod 8 counter whose state q feeds combinational circuit 1 (out1); b. a mod 6 counter whose state q feeds combinational circuit 2 (out2)]

Table 4.8 Truth tables of out1 and out2

q     out1  out2
000   1     1
001   0     0
010   0     0
011   1     1
100   0     0
101   1     1
110   0     –
111   1     –
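The generator of Fig. 4.46a can be checked with a few lines of Python (a behavioral sketch with our own names): a mod 8 counter addresses the out1 column of Table 4.8 and reproduces the sequence 10010101.

```python
# Behavioral check of Fig. 4.46a and Table 4.8 (our naming).

OUT1 = [1, 0, 0, 1, 0, 1, 0, 1]       # out1(q) for q = 0 ... 7 (Table 4.8)

def generate(n_clocks, table, m):
    q, bits = 0, []
    for _ in range(n_clocks):
        bits.append(table[q])         # combinational output for state q
        q = (q + 1) % m               # the counter increments on the pulse
    return bits

bits = generate(16, OUT1, 8)          # two periods of 10010101
```

Running the same function with the first six entries of the out2 column and m = 6 would reproduce the sequence 100101 of Fig. 4.46b.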

Table 4.9 Mod 6 addition

q     qΔ
000   001
001   010
010   011
011   100
100   101
101   000
110   –
111   –

[Fig. 4.47 Memory structure: an address decoder (inputs a1 a0 from the address bus) drives word lines 0–3; bit lines connect every cell column, through the read/write circuitry (control input R/W), to data bus lines d5 … d0]

4.6.3 Memories

Memories are essential components of any digital system. They have the capacity to store a large amount of data. Functionally they are equivalent to a set of registers that can be accessed individually, either to write a new data or to read a previously stored data.

4.6.3.1 Types of Memories
A generic memory structure is shown in Fig. 4.47. It is an array of small cells, each of which stores one bit. This array is logically organized as a set of rows, where each row stores a word. In the example of Fig. 4.47 there are four words and each of them has six bits. The selection of a particular word, either to read or to write, is done by the address inputs. In this example, to select a word among four, two bits a1 and a0 connected to an address bus are used. An address decoder generates the row selection signals (the word lines). For example, if a1 a0 = 10 (2 in decimal) then word number 2 is selected. On the other hand, the bidirectional (input/output) data lines are connected to the bit lines. Thus, if word number 2 is selected by the address inputs, then d5 is connected to bit number 5 of word number 2, d4 to bit number 4 of word number 2, and so on. A control input R/W (read/write) defines the operation, for example write if R/W = 0 and read if R/W = 1.


[Fig. 4.48 Types of memories: volatile (DRAM, SRAM, read/write) vs. non-volatile; the non-volatile group splits into read/write under special conditions (EPROM, flash, EEPROM) and read only (mask-programmed ROM, OTP ROM)]

[Fig. 4.49 Commercial memory types, placed by storage permanence (near zero for SRAM/DRAM, battery life for NVRAM, tens of years for EPROM/EEPROM, life of product for mask-programmed and OTP ROM) and by write ability (external programmer one time, external programmer thousands of cycles, in-system block-oriented writes thousands of cycles, in-system fast write with unlimited cycles); the ideal memory combines nonvolatility with full in-system programmability]

A list of the main types of memories is given in Fig. 4.48. A first classification criterion is volatility: volatile memories lose their contents when the power supply is turned off while nonvolatile memories do not. Within nonvolatile memories there are read-only memories (ROM) and read/write memories. ROMs are programmed either at manufacturing time (mask programmable ROM) or by the user, but only one time (OTP = one-time programmable ROM). Other nonvolatile memories can be programmed several times by the user: erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), and flash memories (a block-oriented EEPROM). Volatile memories can be read and written. They are called random access memories (RAM) because the access time to a particular stored data does not depend on the location of the data (as a matter of fact a ROM has the same characteristic). There are two families: static RAM (SRAM) and dynamic RAM (DRAM). The diagram of Fig. 4.49 shows different types of commercial memories classified according to their storage permanence (the maximum time without loss of information) and their ability to be programmed. Some memories can be programmed within the system to which they are connected (for example a printed circuit board); others must be programmed outside the system using a device called a memory programmer. Observe also that EPROM, EEPROM, and flash memories can be reprogrammed a large number of times (thousands) but not an infinite number of times. With regard to the time necessary to write a data, SRAM and DRAM are much faster than nonvolatile memories. Nonvolatile RAM (NVRAM) is a battery-powered SRAM.


An ideal memory should have storage permanence equal to its lifetime, with the ability to be loaded as many times as necessary during its lifetime. The extreme cases are, on the one hand, mask programmable ROM that have the largest storage permanence but no reprogramming possibility and, on the other hand, static and dynamic RAM that have full programming ability.

4.6.3.2 Random Access Memories
RAMs are volatile memories. They lose their contents when the power supply is turned off. Their structure is the general one of Fig. 4.47. Under the control of the R/W control input, read and write operations can be executed: if the address bus a = i and R/W = 1 then the data stored in word number i is transmitted to the data bus d; if the address bus a = i and R/W = 0 then the current value of bus d is stored in word number i. Consider a RAM with n address bits that stores m-bit words (in Fig. 4.47, n = 2 and m = 6). Its behavior is defined by the following piece of program:

i = conv(a);
if R/W = 1 then d = word(i);
else word(i) = d;
end if;

in which conv is a conversion function that translates an n-bit vector (an address) to a natural belonging to the interval 0 to 2^n − 1 (this is the function of the address decoder of Fig. 4.47) and word is a vector of 2^n m-bit vectors (an array). A typical SRAM cell is shown in Fig. 4.50. It consists of a D-type latch, some additional control gates, and a tristate buffer. The word line is an output of the address decoder and the bit line is connected to the data bus through the read/write circuitry (Fig. 4.47). When the word line is equal to 1 and R/W = 0, the load input is equal to 1, the tristate output buffer is in high impedance (state Z, disconnected), and the value of the bit line connected to D is stored within the latch. When the word line is equal to 1 and R/W = 1, the load input is equal to 0 and the value of Q is transmitted to the bit line through the tristate buffer. Modern SRAM chips have a capacity of up to 64 megabits. Their read time is between 10 and 100 nanoseconds, depending on their size. Their power consumption is smaller than that of DRAM chips. An example of a very simple dynamic RAM (DRAM) cell is shown in Fig. 4.51a. It is made up of a very small capacitor and a transistor used as a switch. When the word line is selected, the cell capacitor is connected to the bit line through the transistor. In the case of a write operation, the bit line is connected to an external data input and the electrical charge stored in the cell capacitor is proportional to the input logic level (0 or 1). This electrical charge constitutes the stored information. However this information must be periodically refreshed because, otherwise, it would be

[Fig. 4.50 SRAM cell: a D latch whose load input is gated by the word line and R/W, connected to the bit line through a tristate buffer]

[Fig. 4.51 DRAM cell: a. a one-transistor, one-capacitor cell on a word line and bit line; b. bit-line pre-charge and comparison circuitry; c. read/write circuitry with refresh, connected to the data bus]

quickly lost due to leakage currents. In the case of a read operation, the bit line is connected to a data output. The problem is that the cell capacitor is much smaller than the bit line's equivalent capacitance, so that when the cell capacitor is connected to the bit line, the stored electrical charge practically disappears. Thus the read operation is destructive. Some additional electronic circuitry is used to sense the very small voltage variations on the bit line when a read operation is executed: before the connection of the bit line to a cell capacitor, it is pre-charged to an intermediate value (between levels 0 and 1); then an analog comparator is used to sense the very small voltage variation on the bit line in order to decide whether the stored information was 0 or 1 (Fig. 4.51b). Once the stored information has been read (and thus destroyed) it is rewritten into the original memory location. The data bus interface of Fig. 4.51c includes the analog comparators (one per bit line) as well as the logic circuitry in charge of rewriting the read data. To refresh the memory contents, all memory locations are periodically read (and thus rewritten). Modern DRAM chips have a capacity of up to 2 gigabits, a much larger capacity than SRAM chips. On the other hand, they are slower than SRAM and have higher power consumption.
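The word-level read/write behavior described by the piece of program in Sect. 4.6.3.2 (with R/W = 1 for read) can be sketched as a small Python model; the class and member names are our own, not the book's:

```python
# Behavioral sketch of the RAM of Fig. 4.47: 2^n words of m bits,
# R/W = 1 reads word i onto the bus, R/W = 0 writes the bus into word i.

class RAM:
    def __init__(self, n, m):
        self.m = m
        self.word = [0] * (2 ** n)          # 2^n words of m bits each

    def access(self, a, rw, d=0):
        i = a                               # conv(): address as a natural
        if rw == 1:
            return self.word[i]             # read: word i driven onto bus d
        self.word[i] = d % (2 ** self.m)    # write: bus d stored into word i
        return d

ram = RAM(n=2, m=6)                         # 4 words x 6 bits, as in Fig. 4.47
ram.access(2, 0, 0b101010)                  # write word number 2
```

A subsequent read of address 2 (R/W = 1) returns the stored value, while any other word still holds its initial contents.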

4.6.3.3 Read-Only Memories
Within ROM a distinction must be made between mask programmable ROM, whose contents are programmed at manufacturing time, and programmable ROM (PROM), which can be programmed by the user, but only one time. Other names are one-time programmable (OTP) or write-once memories. Their logic structure (Fig. 4.52) is also an array of cells. Each cell may connect, or not, a word line to a bit line. In the case of a mask programmable ROM the programming consists in drawing some of the word-line-to-bit-line connections in the mask that corresponds to one of the metal levels (Sect. 7.1). In the case of a user programmable ROM, either all connections are initially enabled and some of them can be disabled by the user (fuse technologies), or none of them is initially enabled and some of them can be enabled by the user (anti-fuse technologies).

4.6.3.4 Reprogrammable ROM
Reprogrammable ROMs are user programmable ROMs whose contents can be reprogrammed several times. Their logic structure is the same as that of non-reprogrammable ROMs but the word-line-to-bit-line connections are floating-gate transistors instead of metal connections. There are three types of reprogrammable ROM:
• EPROM: Their contents are erased by exposing the chip to ultraviolet (UV) radiation; for that, the chip must be removed from the system (for example the printed circuit board) in which it is used; the chip package must have a window to let the UV light reach the floating-gate transistors; an external programmer is used to (re)program the memory.


[Fig. 4.52 Read-only memory structure: an address decoder (inputs a1 a0) drives word lines 0–3; bit lines connect the cell array, through the read circuitry (control input R), to data bus lines d5 … d0]

[Fig. 4.53 Implementation of a 1 kB memory: a. the target 1,024 x 8 memory (R/W, ME and OE controls, 10-bit address A, 8-bit data D); b. the available 256 x 4 chip (R/W, ME and OE controls, 8-bit address A, 4-bit data D)]

• EEPROM: Their contents are selectively erased, one word at a time, using a specific higher voltage; the chip does not need to be removed from the system; the (re)programming circuitry is included within the chip.
• Flash memories: This is an EEPROM-type memory with better performance; in particular, block operations instead of one-word-at-a-time operations are performed; they are used in many applications, for example pen drives, memory cards, solid-state drives, and many others.

4.6.3.5 Example of Memory Bank
Memory banks implement large memories with memory chips whose capacity is smaller than the desired capacity. As an example, consider the implementation of the memory of Fig. 4.53a, with a capacity of 1 kB (1024 words, 8 bits per word), using the memory chip of Fig. 4.53b that can store 256 4-bit words. Thus, eight memory chips must be used (1024 × 8 = (256 × 4) × 8). Let a9 a8 a7 ... a0 be the address bits of the memory to be implemented (Fig. 4.53a). The 1024-word addressing space is decomposed into four blocks of 256 words, using bits a9 and a8. Each block of 256 words is implemented with two chips working in parallel (Fig. 4.54). To select one of the four blocks the OE (output enable) control inputs are used:

[Fig. 4.54 Address space: bits a9 a8 (00, 01, 10, 11) select one of four 256-word blocks, and bits a7 … a0 select a word within the block; each block is served by one chip of the group 0–3 (data bits d7 … d4) and one chip of the group 4–7 (data bits d3 … d0)]

[Fig. 4.55 Memory bank: eight 256 x 4 chips; a7 … a0 drive all the A inputs, while a9 a8, through a 2-to-4 decoder, generate OE0 … OE7; chips 0–3 drive d7 … d4 and chips 4–7 drive d3 … d0]

OE0 = OE4 = a9′·a8′, OE1 = OE5 = a9′·a8, OE2 = OE6 = a9·a8′, OE3 = OE7 = a9·a8. (4.16)

The complete memory bank is shown in Fig. 4.55. A 2-to-4 address decoder generates the eight output enable functions (4.16). More information about memories can be found in classical books such as Weste and Harris (2010) or Rabaey et al. (2003).
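The address split of Figs. 4.54 and 4.55 can be sketched as a behavioral model (our own decomposition and names): the two high bits pick the block, hence which OE pair the 2-to-4 decoder asserts, and the eight low bits address within the block.

```python
# Sketch of the Fig. 4.54 address split: a9 a8 pick the block, a7 ... a0
# the word inside it; the block number tells which OE pair (OEk and OEk+4)
# the 2-to-4 decoder asserts.

def decode(addr):
    block = addr >> 8                      # a9 a8 as a number 0..3
    local = addr & 0xFF                    # a7 ... a0, common to all chips
    oe = [1 if b == block else 0 for b in range(4)]   # OE0..OE3 (= OE4..OE7)
    return block, local, oe

block, local, oe = decode(0b1000000101)    # a9 a8 = 10, offset 5
```

Here block = 2 and local = 5, so OE2 (and OE6) is the only asserted enable, selecting chips 2 and 6 as required by (4.16).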

4.7 Sequential Implementation of Algorithms

In Sect. 2.8 the relation between algorithms (programming language structures) and combinational circuits was discussed. This relation exists between algorithms and digital circuits in general, not only combinational circuits. As a matter of fact, understanding and using this relation is a basic aspect of this course. Some systems are sequential by nature because their specification includes an explicit or implicit reference to successive time intervals. Some examples have been seen before: generation and detection of sequences (Examples 4.1 and 4.2), control of sequences of events (Sects. 4.1 and 4.3.2), data transmission (Fig. 4.32), timers (Fig. 4.43), and others. However, algorithms without any time reference can also be implemented by sequential circuits.

4.7.1 A First Example

As a first example of synthesis of a sequential circuit from an algorithm, a circuit that computes the integer square root of a natural is designed. Given a natural x it computes r = ⌊√x⌋, where ⌊a⌋ stands for the greatest natural smaller than or equal to a. The following algorithm computes a set of successive pairs (r, s) such that s = (r + 1)^2.

Algorithm 4.1
r0 = 0; s0 = 1;
for i in 0 to N loop
  si+1 = si + 2(ri + 1) + 1; ri+1 = ri + 1;
end loop;

It uses the following relation:

(r + 2)^2 = ((r + 1) + 1)^2 = (r + 1)^2 + 2(r + 1) + 1 = s + 2(r + 1) + 1.

Assume that x > 0. Initially s0 = 1 ≤ x. Then execute the loop as long as si ≤ x. When si ≤ x and si+1 > x, then

ri+1^2 = (ri + 1)^2 = si ≤ x and (ri+1 + 1)^2 = si+1 > x.

Thus ri+1 is the greatest natural smaller than or equal to √x, so that r = ri+1. The following naive algorithm computes r.

Algorithm 4.2 Square Root
r = 0; s = 1;
while s ≤ x loop
  s = s + 2(r + 1) + 1; r = r + 1;
end loop;
root = r;
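Algorithm 4.2 transcribes directly into executable form. The sketch below follows the book's variables r and s (the function name is ours):

```python
# Direct transcription of Algorithm 4.2: the invariant s = (r + 1)^2 is
# kept, and the loop stops as soon as s exceeds x.

def isqrt(x):
    r, s = 0, 1
    while s <= x:
        s = s + 2 * (r + 1) + 1   # (r + 2)^2 = s + 2(r + 1) + 1
        r = r + 1
    return r                      # greatest r with r^2 <= x

values = [isqrt(x) for x in (47, 48, 49)]   # 6, 6, 7; x = 47 matches Table 4.10
```

For x = 47 the successive (r, s) pairs are exactly the rows of Table 4.10, ending at r = 6 when s = 49 > 47.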

Table 4.10 Example of square root computation (x = 47)

r    s     s ≤ 47
0    1     true
1    4     true
2    9     true
3    16    true
4    25    true
5    36    true
6    49    false

[Fig. 4.56 Square root implementation: iterative circuit: a. a chain of identical loop-body cells whose successive first outputs are max(1, root), max(2, root), max(3, root), …; b. one cell mapping (r, s) and x to (next_r, next_s)]

As an example, if x = 47 the successive values of r and s are given in Table 4.10. The result is r = 6 (the first value of r for which condition s ≤ x does not hold). This is obviously not a good algorithm. The number of steps is equal to the square root r. Thus, for great values of x ≅ 2^n, the number of steps is r ≅ 2^(n/2). It is used here for didactic purposes.

Algorithm 4.2 is a loop, so a first option could be an iterative combinational circuit (Sect. 2.8.3). In this case the number of executions of the loop body is not known in advance; it depends on a condition (s ≤ x) computed before each loop body execution. For that reason the algorithm must be slightly modified: once the condition stops holding true, the values of s and r do not change any more.

Algorithm 4.3 Square Root, Version 2
r = 0; s = 1;
loop
  if s ≤ x then next_s = s + 2(r + 1) + 1; next_r = r + 1;
  else next_s = s; next_r = r;
  end if;
  s = next_s; r = next_r;
end loop;
root = r;

The corresponding iterative circuit is shown in Fig. 4.56a. The iterative cell (Fig. 4.56b) executes the loop body, and the connections between adjacent cells implement the instructions s = next_s and r = next_r. The first output of cell number i is either number i or the square root of x; once the square root has been computed, the first output value does not change any more and is equal to the square root r. Thus the number of cells must be equal to the maximum value of r. But this is a very large number. As an example, if n = 32, then x < 2^32 and r < 2^16 = 65,536, so that the number of cells must be equal to 65,535, obviously far too many.


A better idea is a sequential implementation. It takes advantage of the fact that a sequential circuit not only implements combinational functions but also includes memory elements and, thanks to the use of a synchronization signal, permits dividing the time into intervals with which the execution of different operations can be associated. In this example two memory elements store r and s, respectively, and the time intervals are associated with groups of operations:

• First time interval (initial value of the memory elements): r = 0; s = 1;

• Second, third, fourth, ... time intervals:
if s ≤ x then next_s = s + 2(r + 1) + 1; next_r = r + 1;
else next_s = s; next_r = r;
end if;

In the following algorithm a variable end has been added; it detects the end of the square root computation.

Algorithm 4.4 Square Root, Version 3
r = 0; s = 1; -- (initial values of memory elements)
loop
  if s ≤ x then next_s = s + 2(r + 1) + 1; next_r = r + 1; end = FALSE;
  else next_s = s; next_r = r; end = TRUE;
  end if;
  s = next_s; r = next_r; -- synchronization
end loop;
root = r;

When s > x, end is equal to TRUE and the values of r, s, and end will not change any more. The synchronization input clock will be used to modify the values of r and s by replacing their current values by their next values next_r and next_s. The sequential circuit of Fig. 4.57 implements Algorithm 4.4. Two registers store r and s. On reset = 1 the initial values are loaded: r = 0 and s = 1. The combinational circuit implements functions next_r, next_s, and end as defined by the loop body:

if s ≤ x then next_s = s + 2(r + 1) + 1; next_r = r + 1; end = FALSE;
else next_s = s; next_r = r; end = TRUE;
end if;

On each clock pulse the current values of r and s are replaced by next_r and next_s, respectively.
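The register-transfer view of Fig. 4.57 can be simulated cycle by cycle; in this sketch (our naming) each loop iteration plays the role of one clock pulse:

```python
# Cycle-accurate sketch of Fig. 4.57: registers r and s are updated with
# next_r and next_s on each clock pulse, until end goes TRUE.

def sqrt_cycles(x, max_cycles=1000):
    r, s = 0, 1                    # reset = 1: initial register values
    for cycle in range(max_cycles):
        if s <= x:                 # combinational part (the loop body)
            next_r, next_s, end = r + 1, s + 2 * (r + 1) + 1, False
        else:
            next_r, next_s, end = r, s, True
        if end:
            return r, cycle        # root is stable once end is TRUE
        r, s = next_r, next_s      # clock pulse: registers updated
    return r, max_cycles

root, cycles = sqrt_cycles(47)     # root = 6, reached after 6 clock pulses
```

The cycle count confirms the complexity remark above: the circuit needs about √x clock pulses before end is raised.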

[Fig. 4.57 Square root implementation: sequential circuit: a combinational circuit computes next_r, next_s and end from x, r and s; two registers (clock, reset) store r and s; the output root is taken from r]

Comment 4.4
Figure 4.57 is a sequential circuit whose internal states are all pairs {(r, s): r < 2^(n/2), s = (r + 1)^2} and whose input values are all naturals x < 2^n. The corresponding state transition graph would have 2^(n/2) vertices and 2^n edges per vertex: for example (n = 32), 65,536 vertices and 4,294,967,296 edges per vertex. Obviously the specification of this sequential circuit by means of a graph or by means of tables doesn't make sense. This is an example of a system that is described by an algorithm, not by an explicit behavioral description.

4.7.2 Combinational vs. Sequential Implementation

In many cases an algorithm can be implemented either by a combinational or by a sequential circuit. In the case of the example of the preceding section (Sect. 4.7.1) the choice of a sequential circuit was quite evident. In general, what are the criteria that must be used? Conceptually any well-defined algorithm, without time references, can be implemented by a combinational circuit. For that:

• Execute the algorithm using all input variable value combinations and store the corresponding output variable values in a table.
• Generate the corresponding combinational circuit.

Except in the case of very simple digital systems, this is only a theoretical proof that a combinational circuit could be defined. The obtained truth table is generally enormous, so this method would only make sense if the designer had unbounded space to implement the circuit (for example unbounded silicon area, Chap. 7) and unbounded time to develop it. A better method is to directly translate the algorithm instructions into circuits as was done in Sect. 2.8, but even with this method the result might be too large a circuit, as shown in the preceding section (Sect. 4.7.1). In conclusion, there are algorithms that cannot reasonably be implemented by a combinational circuit.

As already mentioned above, a sequential circuit has the capacity to implement switching functions but also the capacity to store data within memory elements. Furthermore, the existence of a synchronization signal permits dividing the time into intervals and assigning different time intervals to different operations. In particular, in the case of loop instructions, the availability of memory elements allows substituting N identical components by one component that iteratively executes the loop body, so that space (silicon area) is replaced by time. But this method is not restricted to the case of iterations. As an example, consider the following algorithm that consists of four instructions:

0: X1 = f1(x);
1: X2 = f2(X1);
2: X3 = f3(X2);
3: R = f4(X3);

[Fig. 4.58 Combinational vs. sequential implementations: a. a chain of four components f1 … f4 computing X1, X2, X3 and R from x; b. a combinational (case) circuit plus memory elements storing R and step, computing next_R and next_step]

It can be implemented by a combinational circuit (Fig. 4.58a) made up of four components that implement functions f1, f2, f3, and f4, respectively. Assume that all data x, X1, X2, X3, and R have the same number n of bits. Then the preceding algorithm could also be implemented by a sequential circuit. For that the algorithm is modified and a new variable step, whose values belong to {0, 1, 2, 3, 4}, is added:

step = 0; -- (initial value of the step identifier)
loop
  case step is
    when 0 => R = f1(x); step = 1;
    when 1 => R = f2(R); step = 2;
    when 2 => R = f3(R); step = 3;
    when 3 => R = f4(R); step = 4;
    when 4 => step = 4;
  end case;
end loop;

The sequential circuit of Fig. 4.58b implements the modified algorithm. It has two memory elements that store R and step (encoded) and a combinational circuit defined by the following case instruction:

case step is
  when 0 => next_R = f1(x); next_step = 1;
  when 1 => next_R = f2(R); next_step = 2;
  when 2 => next_R = f3(R); next_step = 3;
  when 3 => next_R = f4(R); next_step = 4;
  when 4 => next_step = 4;
end case;

When reset = 1 the initial value of step is set to 0. On each clock pulse R and step are replaced by next_R and next_step.
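The behavior of the circuit of Fig. 4.58b can be mimicked by a short Python sketch (the functions f1 to f4 below are hypothetical placeholders, chosen only for illustration): on each simulated clock pulse R and step take the values next_R and next_step.

```python
def run(x, fs):
    """Simulate Fig. 4.58b for one input x; fs holds f1..f4."""
    R, step = 0, 0                      # reset = 1: step starts at 0
    while step < 4:                     # step 4 is the idle final state
        if step == 0:
            next_R, next_step = fs[0](x), 1
        else:
            next_R, next_step = fs[step](R), step + 1
        R, step = next_R, next_step     # clock pulse: update both registers
    return R

# Placeholder functions (not from the book):
fs = [lambda v: v + 1, lambda v: 2 * v, lambda v: v - 3, lambda v: v % 5]
# run(10, fs) computes f4(f3(f2(f1(10)))) in four simulated clock cycles.
```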

118 4 Sequential Circuits

What implementation is better? Let ci and ti be the cost and computation time of the component that computes fi. Then the cost Ccomb and computation time Tcomb of the combinational circuit are

Ccomb = c1 + c2 + c3 + c4 and Tcomb = t1 + t2 + t3 + t4.    (4.17)

The cost Csequ and the computation time Tsequ of the sequential circuit are equal to

Csequ = Ccase + Creg and Tsequ = 4·Tclock,    (4.18)

where Ccase is the cost of the circuit that implements the combinational circuit of Fig. 4.58b, Creg is the cost of the registers that store R and step, and Tclock is the clock signal period. The combinational circuit of Fig. 4.58b is a programmable resource that, under the control of the step variable, computes f1, f2, f3, or f4. There are two extreme cases:

• If no circuit part can be shared between some of those functions, then this programmable resource implicitly includes the four components of Fig. 4.58a plus some additional control circuit; its cost is greater than the sum c1 + c2 + c3 + c4 = Ccomb, and Tclock must be greater than the computation time of the slowest component; thus Tsequ = 4·Tclock is greater than Tcomb = t1 + t2 + t3 + t4.
• The other extreme case is when f1 = f2 = f3 = f4 = f; then the algorithm is an iteration; let c and t be the cost and computation time of the component that implements f; then Ccomb = 4c, Tcomb = 4t, Csequ = c + Creg, and Tsequ = 4·Tclock where Tclock must be greater than t; if the register cost is much smaller than c and if the clock period is almost equal to t, then Csequ ≅ c = Ccomb/4 and Tsequ ≅ 4t = Tcomb.

In the second case (iteration) the sequential implementation is generally better than the combinational one. In other cases, it depends on the possibility to share, or not, some circuit parts between functions f1, f2, f3, and f4. In conclusion, algorithms can be implemented by circuits that consist of

• Memory elements that store variables and time interval identifiers (step in the preceding example)
• Combinational components that execute operations depending on the particular time interval, with operands that are internally stored variables and input signals

The circuit structure is shown in Fig. 4.59. It is a sequential circuit whose internal states correspond to combinations of variable values and step identifier values.
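The iteration case can be checked numerically with a small Python sketch (the costs and times below are arbitrary values, chosen only to illustrate (4.17) and (4.18)):

```python
# Cost and time of the combinational chain, (4.17).
def comb(cs, ts):
    return sum(cs), sum(ts)

# Cost and time of the sequential version, (4.18).
def sequ(c_case, c_reg, t_clock, steps=4):
    return c_case + c_reg, steps * t_clock

# Iteration case f1 = f2 = f3 = f4 = f, with hypothetical c = 100, t = 5:
c, t = 100, 5
C_comb, T_comb = comb([c] * 4, [t] * 4)              # 4c and 4t
C_sequ, T_sequ = sequ(c_case=c, c_reg=8, t_clock=t)  # c + Creg and 4*Tclock
# Csequ is close to Ccomb / 4 while Tsequ equals Tcomb.
```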

[Fig. 4.59 Structure of a sequential circuit that implements an algorithm: combinational circuits compute the outputs and the next values of the variables and of step from the inputs and the stored state; memory elements, driven by clock and reset, store the variables and step.]

4.8 Finite-State Machines

Finite-state machines (FSM) are algebraic models of sequential circuits. As a matter of fact the physical implementation of a finite-state machine is a sequential circuit so that part of this section repeats subjects already studied in previous sections.

4.8.1 Definition

Like a sequential circuit, a finite-state machine has input signals, output signals, and internal states. Three finite sets are defined:

• Σ (input states, input alphabet) is the set of values of the input signals.
• Ω (output states, output alphabet) is the set of values of the output signals.
• S is the set of internal states.

The working of the finite-state machine is specified by two functions f (next-state function) and h (output function):

• f: S × Σ → S associates an internal state to every pair (internal state, input state).
• h: S × Σ → Ω associates an output state to every pair (internal state, input state).

Any sequential circuit can be modelled by a finite-state machine.

Example 4.9 A 3-bit up counter, with EN (count enable) control input, can be modelled by a finite-state machine: it has one binary input EN; three binary outputs q2, q1, and q0; and eight internal states, so that

Σ = {0, 1},
Ω = {000, 001, 010, 011, 100, 101, 110, 111},
S = {0, 1, 2, 3, 4, 5, 6, 7},
f(s, 0) = s and f(s, 1) = (s + 1) mod 8, ∀s ∈ S,
h(s, 0) = h(s, 1) = binary_encoded(s), ∀s ∈ S,

where binary_encoded(s) is the binary representation of s. The corresponding sequential circuit is shown in Fig. 4.60: it consists of a 3-bit register and a combinational circuit that implements f, for example a multiplexer and a circuit that computes q + 1.

[Fig. 4.60 3-bit counter: an increment circuit computes q + 1; a 2-to-1 multiplexer controlled by EN selects q or q + 1 as the next state; a 3-bit register driven by clock and reset stores the current state q.]
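Example 4.9 can be simulated directly in Python (a sketch of the f and h defined above, driven by an arbitrary EN sequence):

```python
def f(s, en):            # next-state function f: S x Sigma -> S
    return (s + 1) % 8 if en else s

def h(s, en):            # output function: h(s, 0) = h(s, 1) = binary_encoded(s)
    return format(s, "03b")

s = 0                    # reset state
outputs = []
for en in (1, 1, 1, 0, 1):   # arbitrary input sequence on EN
    s = f(s, en)             # register update on the clock edge
    outputs.append(h(s, en)) # q2 q1 q0
```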


Thus, finite-state machines can be seen as a formal way to specify the behavior of a sequential circuit. Nevertheless, they are mainly used to define the working of circuits that control sequences of operations rather than to describe the operations themselves. The difference between the Moore and Mealy models has already been seen before (Sect. 4.3.1). In terms of finite-state machines, the difference is the output function h definition. In the case of the Moore model

h: S → Ω,    (4.19)

and in the case of the Mealy model

h: S × Σ → Ω.    (4.20)

In the first case the corresponding circuit structure is shown in Fig. 4.61. A first combinational circuit computes the next state as a function of the current state and of the input. Assume that its propagation time is equal to t1 seconds. Another combinational circuit computes the output state as a function of the current state. Assume that its propagation time is equal to t2 seconds. Assume also that the input signal comes from another synchronized circuit and is stable tSUinput seconds (SU means Set Up) after the active clock edge. A chronogram of the signal values during a state transition is shown in Fig. 4.62. The register delay is assumed to be negligible so that the new current state value (register output) is stable at the beginning of the clock cycle. The output will be stable after t2 seconds. The input is stable after tSUinput seconds (tSUinput could be the value t2 of another finite-state machine). The next state will be stable tSUinput + t1 seconds later. In conclusion, the clock period must be greater than t2 and tSUinput + t1:

Tclock > max(tSUinput + t1, t2).    (4.21)

This is an example of computation of the minimum permitted clock period and thus of the maximum clock frequency.
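Relation (4.21) is easy to evaluate; the delay values below are hypothetical, in arbitrary time units:

```python
# Hypothetical delays for the Moore structure of Fig. 4.61.
t1, t2, t_su_input = 4.0, 2.5, 1.5

T_clock_min = max(t_su_input + t1, t2)   # minimum clock period, (4.21)
f_max = 1.0 / T_clock_min                # maximum clock frequency
```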

[Fig. 4.61 Moore model: sequential circuit structure — combinational circuit1 (t1) computes the next state from the current state and the input state (tSUinput); a register driven by clock and reset stores the current state; combinational circuit2 (t2) computes the output state from the current state only.]

[Fig. 4.62 Moore model: chronogram — after the active clock edge the output state is stable after t2, the input state after tSUinput, and the next state after tSUinput + t1.]


[Fig. 4.63 Mealy model: sequential circuit structure — same as Fig. 4.61 except that combinational circuit2 (t2) computes the output state from both the current state and the input state (tSUinput).]

[Fig. 4.64 Mealy model: chronogram — after the active clock edge the input state is stable after tSUinput, the next state after tSUinput + t1, and the output state after tSUinput + t2.]

The circuit structure that corresponds to the Mealy model is shown in Fig. 4.63. A first combinational circuit computes the next state as a function of the current state and of the input. Assume that its propagation time is equal to t1 seconds. Another combinational circuit computes the output state as a function of the current state and of the input. Assume that its propagation time is equal to t2 seconds. Assume also that the input signal comes from another synchronized circuit and is stable tSUinput seconds (SU means Set Up) after the active clock edge. A chronogram of the signal values during a state transition is shown in Fig. 4.64. As before, the register delay is assumed to be negligible so that the new current state value is stable at the beginning of the clock cycle. The input is stable after tSUinput seconds. The next state will be stable tSUinput + t1 seconds later and the output will be stable tSUinput + t2 seconds later. In conclusion, the clock period must be greater than tSUinput + t2 and tSUinput + t1:

Tclock > max(tSUinput + t1, tSUinput + t2).    (4.22)

4.8.2 VHDL Model

Throughout this course a formal language (pseudocode), very similar to VHDL, has been used to describe algorithms. In this section, complete executable VHDL definitions of finite-state machines are presented. An introduction to VHDL is given in Appendix A. The structure of a Moore finite-state machine is shown in Fig. 4.61. It consists of three blocks: a combinational circuit that computes the next state, a combinational circuit that computes the output, and a register. Thus, a straightforward VHDL description consists of three processes, one for each block:


library ieee;
use ieee.std_logic_1164.all;
use work.my_fsm.all;
entity MooreFsm is
  port (
    clk, reset: in std_logic;
    x: in std_logic_vector(N-1 downto 0);
    y: out std_logic_vector(M-1 downto 0)
  );
end MooreFsm;
architecture behavior of MooreFsm is
  signal current_state, next_state: state;
begin
  next_state_function: process(current_state, x)
  begin
    next_state ...

...

    when (OUTPUT_VALUE, i, A) => OUT(i) = A; number = number + 1;
    when (OPERATION, i, j, k, f) => X(k) = f(X(i), X(j)); number = number + 1;
    when (JUMP, N) => number = N;
    when (JUMP_POS, i, N) => if X(i) > 0 then number = N; else number = number + 1; end if;
    when (JUMP_NEG, i, N) => if X(i) < 0 then number = N; else number = number + 1; end if;
  end case;
end loop;


Table 5.3 Number of instructions

Code and list of parameters   Operation            Number of instructions
ASSIGN_VALUE, k, A            Xk = A               16 × 256 = 4096
DATA_INPUT, k, j              Xk = INj             16 × 8 = 128
DATA_OUTPUT, i, j             OUTi = Xj            8 × 16 = 128
OUTPUT_VALUE, i, A            OUTi = A             8 × 256 = 2048
OPERATION, i, j, k, f         Xk = f(Xi, Xj)       16 × 2 × 16 × 16 = 8192
JUMP, N                       goto N               256
JUMP_POS, i, N                if Xi > 0 goto N     16 × 256 = 4096
JUMP_NEG, i, N                if Xi < 0 goto N     16 × 256 = 4096

Comment 5.2 Table 5.2 defines eight instruction types. The number of different instructions depends on the parameter sizes. Assume that the internal memory X stores sixteen 8-bit data and that the program memory stores at most 256 instructions so that the addresses are also 8-bit vectors. Assume also that there are two different operations f. The number of instructions of each type is shown in Table 5.3: there are 16 memory elements Xi, 256 constants A, 8 input ports INi, 8 output ports OUTi, 256 addresses N, and 2 operations f. The total number is 4096 + 128 + 128 + 2048 + 8192 + 256 + 4096 + 4096 = 23,040, a number greater than 2^14. Thus, the minimum number of bits needed to associate a different binary code to every instruction is 15.
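The count of Comment 5.2 can be reproduced with a few lines of Python:

```python
from math import ceil, log2

# 16 registers Xi, 256 constants A, 8 input ports, 8 output ports,
# 256 addresses N, 2 operations f (Table 5.3).
counts = {
    "ASSIGN_VALUE": 16 * 256,
    "DATA_INPUT":   16 * 8,
    "DATA_OUTPUT":  8 * 16,
    "OUTPUT_VALUE": 8 * 256,
    "OPERATION":    16 * 2 * 16 * 16,
    "JUMP":         256,
    "JUMP_POS":     16 * 256,
    "JUMP_NEG":     16 * 256,
}
total = sum(counts.values())   # 23,040 different instructions
bits = ceil(log2(total))       # minimum number of bits per instruction code
```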

5.3 Structural Specification

The implementation method is top-down. The first step was the definition of a functional specification (Sect. 5.2). Now, this specification will be translated to a block diagram.

5.3.1 Block Diagram

To deduce a block diagram from the functional specification (Algorithm 5.3) the following method is used: extract from the algorithm the set of processed data, the list of data transfers, and the list of data operations. The processed data are the following:

• Input data: There are eight input ports INi and an input signal instruction.
• Output data: There are eight output ports OUTi and an output signal number.
• Internal data: There are 16 internally stored data Xi.

The data transfers are the following:

• Transmit the value of a memory element Xj or of a constant A to an output port.
• Update number with number + 1 or with a jump address N.
• Store in a memory element Xk a constant A, an input port value INj, or the result of an operation f.

The operations are {f(Xi, Xj)} with all possible functions f. The proposed block diagram is shown in Fig. 5.7. It consists of five components.

146 5 Synthesis of a Processor

[Fig. 5.7 Block diagram: the input selection component (IN0–IN7, instruction) feeds the register bank {Xi}; the output selection component drives OUT0–OUT7; the register bank provides outputs Xi and Xj; the computation resources compute f(Xi, Xj); the go to component, controlled by instruction and by Xi, produces number.]

• Register bank: This component contains the set of internal memory elements {Xi}. It is a 16-word memory with a data input to Xk and two data outputs Xi and Xj. It will be implemented in such a way that within a clock period two data Xi and Xj can be read and an internal memory element Xk can be updated (written). Thus the operation Xk = f(Xi, Xj) is executed in one clock cycle. This block is controlled by the instruction code and by the parameters i, j, and k.
• Output selection: This component transmits to the output port OUTi the rightmost register bank output Xj or a constant A. It is controlled by the instruction code and by the parameters i and A.
• Go to: It is a programmable counter: it stores the current value of number; during the execution of each instruction, it replaces number by number + 1 or by a jump address N. It is controlled by the instruction code, by the parameter N, and by the leftmost register bank output Xi, whose most significant bit value (sign bit) is used in the case of conditional jumps.
• Input selection: This component selects the data to be sent to the register bank, that is, an input port INj, a constant A, or the result of an operation f. It is controlled by the instruction code and by the parameters j and A.
• Computation resources: This is an arithmetic unit that computes a function f with operands that are the two register bank outputs Xi and Xj. It is controlled by the instruction code and by the parameter f.

The set of instruction types has already been defined (Table 5.2). All input data (INj), output data (OUTj), and internally stored data (Xi, Xj, Xk) are 8-bit vectors. It remains to define the size of the instruction parameters and the arithmetic operations:

• There are eight input ports and eight output ports and the register bank stores 16 words; thus i, j, and k are 4-bit vectors.
• The maximum number of instructions is 256 so that number is an 8-bit natural.
With regard to the arithmetic operations, the sixteen 8-bit vectors X0 to X15 are interpreted as 2's complement integers ((3.4) with n = 8). Thus −128 ≤ Xi ≤ 127, ∀i = 0–15. There are two operations f: Xk = (Xi + Xj) mod 256 and Xk = (Xi − Xj) mod 256. The instruction encoding will be defined later.
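The two operations, and the 2's complement reading of the 8-bit results, can be sketched in Python:

```python
def to_signed(b):
    """Read an 8-bit value as a 2's complement integer, (3.4) with n = 8."""
    return b - 256 if b >= 128 else b

def op_add(xi, xj):
    return (xi + xj) % 256    # Xk = (Xi + Xj) mod 256

def op_sub(xi, xj):
    return (xi - xj) % 256    # Xk = (Xi - Xj) mod 256

# 100 + 100 overflows the signed range: the 8-bit result 200 reads as -56.
# 5 - 7 wraps to 254, which reads as -2.
```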


[Fig. 5.8 Input selection: inputs IN0–IN7, result, and instruction; output to_reg.]

[Fig. 5.9 Output selection: inputs reg and instruction; outputs OUT0–OUT7.]

5.3.2 Component Specification

Each component will be functionally described.

5.3.2.1 Input Selection

To define the working of the input selection component (Fig. 5.8), extract from Algorithm 5.3 the instructions that select the data inputted to the register bank:

Algorithm 5.4 Input Selection

loop
  case instruction is
    when (ASSIGN_VALUE, k, A) => to_reg = A;
    when (DATA_INPUT, k, j) => to_reg = IN(j);
    when (OPERATION, i, j, k, f) => to_reg = result;
    when others => to_reg = don't care;
  end case;
end loop;

5.3.2.2 Output Selection

To define the working of the output selection component (Fig. 5.9), extract from Algorithm 5.3 the instructions that select the data outputted to the output ports:

Algorithm 5.5 Output Selection

loop
  case instruction is
    when (DATA_OUTPUT, i, j) => OUT(i) = reg;

    when (OUTPUT_VALUE, i, A) => OUT(i) = A;
  end case;
end loop;

[Fig. 5.10 Register bank: inputs reg_in and instruction; internal memory elements {Xi}; outputs left_out and right_out.]

The output ports are registered outputs: if the executed instruction is neither DATA_OUTPUT nor OUTPUT_VALUE, or if k ≠ i, the value of OUTk does not change.

5.3.2.3 Register Bank

The register bank is a memory that stores sixteen 8-bit words (Fig. 5.10). Its working is described by the set of instructions of Algorithm 5.3 that read or write some memory elements (Xi, Xj, Xk). It is important to observe (Fig. 5.7) that in the case of the DATA_OUTPUT instruction Xj is the rightmost output of the register bank and in the case of the JUMP_POS and JUMP_NEG instructions Xi is the leftmost output of the register bank.

Algorithm 5.6 Register Bank

loop
  case instruction is
    when (ASSIGN_VALUE, k, A) => X(k) = reg_in; left_out = don't care; right_out = don't care;
    when (DATA_INPUT, k, j) => X(k) = reg_in; left_out = don't care; right_out = don't care;
    when (DATA_OUTPUT, i, j) => right_out = X(j); left_out = don't care;
    when (OPERATION, i, j, k, f) => X(k) = reg_in; left_out = X(i); right_out = X(j);
    when (JUMP_POS, i, N) => left_out = X(i); right_out = don't care;
    when (JUMP_NEG, i, N) => left_out = X(i); right_out = don't care;

    when others => left_out = don't care; right_out = don't care;
  end case;
end loop;
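Algorithm 5.6 can be paraphrased as a Python sketch (None stands for the don't-care outputs; the instruction is simplified to a (code, i, j, k) tuple):

```python
X = [0] * 16   # the sixteen 8-bit memory elements

def register_bank(instr, reg_in):
    """Two read ports (left = X[i], right = X[j]) and one write port (X[k])."""
    code, i, j, k = instr
    left_out = right_out = None            # don't care by default
    if code in ("ASSIGN_VALUE", "DATA_INPUT"):
        X[k] = reg_in
    elif code == "DATA_OUTPUT":
        right_out = X[j]
    elif code == "OPERATION":
        left_out, right_out = X[i], X[j]   # read before the write
        X[k] = reg_in
    elif code in ("JUMP_POS", "JUMP_NEG"):
        left_out = X[i]
    return left_out, right_out

register_bank(("ASSIGN_VALUE", 0, 0, 3), 42)             # X3 = 42
left, right = register_bank(("OPERATION", 3, 3, 5), 84)  # reads X3, writes X5
```

Reading X[i] and X[j] before the write to X[k] mirrors the requirement that both reads and the write fit within one clock period.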

5.3.2.4 Computation Resources

To define the working of the computation resources component (Fig. 5.11), extract from Algorithm 5.3 the instruction that computes f. Remember that there are only two operations: addition and subtraction.

Algorithm 5.7 Computation Resources

loop
  case instruction is
    when (OPERATION, i, j, k, f) =>
      if f = addition then result = (left_in + right_in) mod 256;
      else result = (left_in - right_in) mod 256;
      end if;
    when others => result = don't care;
  end case;
end loop;

[Fig. 5.11 Computation resources: inputs left_in, right_in, and instruction; output result.]

5.3.2.5 Go To

This component (Fig. 5.12) is in charge of computing the address of the next instruction within the program memory:

[Fig. 5.12 Go to component: inputs instruction and data; output number.]


Algorithm 5.8 Go To

number = 0;
loop
  case instruction is
    when (JUMP, N) => number = N;
    when (JUMP_POS, i, N) => if data > 0 then number = N; else number = number + 1; end if;
    when (JUMP_NEG, i, N) => if data < 0 then number = N; else number = number + 1; end if;
    when others => number = number + 1;
  end case;
end loop;
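A Python sketch of Algorithm 5.8 (data is the leftmost register bank output Xi; the instruction is reduced to a (code, N) pair for illustration):

```python
def next_number(number, instr, data):
    code, N = instr
    if code == "JUMP":
        return N
    if code == "JUMP_POS":
        return N if data > 0 else number + 1
    if code == "JUMP_NEG":
        return N if data < 0 else number + 1
    return number + 1          # all other instructions

n = 0
n = next_number(n, ("OPERATION", None), 0)   # sequential: 1
n = next_number(n, ("JUMP_NEG", 9), -3)      # taken: 9
n = next_number(n, ("JUMP_POS", 4), -3)      # not taken: 10
```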

5.4 Component Implementation

The final step of this top-down implementation is the synthesis of all components that have been functionally defined in Sect. 5.3.2. Every component is implemented with logic gates, multiplexers, flip-flops, and so on. A VHDL model of each component will also be generated.

5.4.1 Input Selection Component

The input and output signals of this component are shown in Fig. 5.8 and its functional specification is defined by Algorithm 5.4. It is a combinational circuit: the value of to_reg only depends on the current value of inputs instruction, IN0 to IN7, and result. Instead of inputting the complete instruction code to the component, a 2-bit input_control variable (Table 5.4) that classifies the instruction types into four categories, namely ASSIGN_VALUE, DATA_INPUT, OPERATION, and others, is defined. Once the encoding of the instructions is defined, an instruction decoder that generates (among others) this 2-bit variable will be designed. From Algorithm 5.4 and Table 5.4 the following description is obtained:

Algorithm 5.9 Input Selection Component

loop
  case input_control is
    when 00 => to_reg = A;
    when 01 => to_reg = IN(j);
    when 10 => to_reg = result;

Table 5.4 Encoded instruction types (input instructions)

Instruction type   input_control
ASSIGN_VALUE       00
DATA_INPUT         01
OPERATION          10
Others             11


[Fig. 5.13 Input selection implementation: (a) component symbol with inputs IN0–IN7, j, A, input_control, result and output to_reg; (b) an 8-to-1 multiplexer controlled by j selects selected_port among IN0–IN7, and a 4-to-1 multiplexer controlled by input_control selects to_reg among A (00), selected_port (01), result (10), and 0 (11).]

    when 11 => to_reg = don't care;
  end case;
end loop;
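The two-multiplexer structure of Fig. 5.13b corresponds to the following Python sketch (the don't-care output is arbitrarily driven to 0, as in the figure; the port values are hypothetical):

```python
def input_selection(input_control, j, A, result, IN):
    selected_port = IN[j]        # MUX8-1 controlled by j
    return {                     # MUX4-1 controlled by input_control
        "00": A,                 # ASSIGN_VALUE
        "01": selected_port,     # DATA_INPUT
        "10": result,            # OPERATION
        "11": 0,                 # others: don't care
    }[input_control]

IN = [10, 11, 12, 13, 14, 15, 16, 17]   # hypothetical input port values
```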

The component inputs are input_control, A, j (parameters included in the instruction), IN0 to IN7, and result, and the component output is to_reg (Fig. 5.13a). A straightforward implementation with two multiplexers is shown in Fig. 5.13b. The following VHDL model describes the circuit of Fig. 5.13b. Its architecture consists of two processes that describe the 8-bit MUX8-1 and MUX4-1:

package main_parameters is
  constant m: natural := 8; -- m-bit processor
end main_parameters;

library IEEE;
use IEEE.std_logic_1164.all;
use work.main_parameters.all;
entity input_selection is
  port (
    IN0, IN1, IN2, IN3, IN4, IN5, IN6, IN7: in std_logic_vector(m-1 downto 0);
    A, result: in std_logic_vector(m-1 downto 0);
    j: in std_logic_vector(2 downto 0);
    input_control: in std_logic_vector(1 downto 0);
    to_reg: out std_logic_vector(m-1 downto 0)
  );
end input_selection;
architecture structure of input_selection is
  signal selected_port: std_logic_vector(m-1 downto 0);
begin
  first_mux: process(j, IN0, IN1, IN2, IN3, IN4, IN5, IN6, IN7)
  begin
    case j is
      when "000" => selected_port <= IN0;
      when "001" => selected_port <= IN1;
      when "010" => selected_port <= IN2;
      when "011" => selected_port <= IN3;
      when "100" => selected_port <= IN4;
      when "101" => selected_port <= IN5;
      when "110" => selected_port <= IN6;
      when others => selected_port <= IN7;
    end case;
  end process;
  second_mux: process(input_control, A, selected_port, result)
  begin
    case input_control is
      when "00" => to_reg <= A;
      when "01" => to_reg <= selected_port;
      when "10" => to_reg <= result;
      when others => to_reg <= (others => '0');
    end case;
  end process;
end structure;

5.4.2 Computation Resources

The functional specification of the computation resources component is defined by Algorithm 5.7. In fact, the working of this component when the executed instruction is not an operation (others in Algorithm 5.7) doesn't matter. As before, instead of inputting the complete instruction to the component, a control variable f, equal to 0 in the case of an addition and to 1 in the case of a subtraction, will be generated by the instruction decoder. This is the component specification:

Algorithm 5.10 Arithmetic Unit

if f = 0 then result = (left_in + right_in) mod 256;
else result = (left_in - right_in) mod 256;
end if;

This component (Fig. 5.14) is a mod 256 adder/subtractor controlled by a control input f (Fig. 3.3 with n = 8 and a/s = f and without the output ovf). The following VHDL model uses the IEEE arithmetic packages. All commercial synthesis tools use those packages and would generate an efficient arithmetic unit. The package main_parameters has already been defined before (Sect. 5.4.1):

[Fig. 5.14 Arithmetic unit: inputs left_in and right_in, control input f (add/subtract), output result.]

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_unsigned.all;
use work.main_parameters.all;
entity computation_resources is
  port (
    left_in, right_in: in std_logic_vector(m-1 downto 0);
    f: in std_logic;
    result: out std_logic_vector(m-1 downto 0)
  );
end computation_resources;
architecture behavior of computation_resources is
begin
  process(f, left_in, right_in)
  begin
    if f = '0' then result <= left_in + right_in;
    else result <= left_in - right_in;
    end if;
  end process;
end behavior;

...

    OUT(i) = A;
  when others => null;
  end case;
end loop;

The component inputs are out_en, out_sel, A, i (parameters included in the instruction), and reg, and the component outputs are OUT0 to OUT7 (Fig. 5.15a). An implementation is shown in Fig. 5.15b. The outputs OUT0 to OUT7 are stored in eight registers, each of them with a CEN (clock enable) control input. The clock signal is not represented in Fig. 5.15b.

Table 5.5 Encoded instruction types (output instructions)

Instruction type   out_en   out_sel
DATA_OUTPUT        1        1
OUTPUT_VALUE       1        0
Others             0        –

[Fig. 5.15 Output selection implementation: (a) component symbol with inputs out_en, i, out_sel, A, reg and outputs OUT0–OUT7; (b) a 3-to-8 address decoder and eight AND2 gates generate enable signals EN0–EN7, a 2-to-1 multiplexer controlled by out_sel selects to_ports between A (0) and reg (1), and eight CEN-controlled registers store OUT0–OUT7.]

An address decoder and a set of AND2 gates generate eight signals EN0 to EN7 that enable the clock signals of the output registers. Thus, the clock of the output register number s is enabled if s = i and out_en = 1. The value that is stored in the selected register is either A (if out_sel = 0) or reg (if out_sel = 1). If out_en = 0 none of the clock signals is enabled so that the value of out_sel doesn't matter (Table 5.5). The following VHDL model describes the circuit of Fig. 5.15b. Its architecture consists of four processes that describe the 3-to-8 address decoder, the set of eight AND2 gates, the 8-bit MUX2-1, and the set of output registers. The eight outputs of the address decoder are defined as vector DEC_OUT:

library ieee;
use ieee.std_logic_1164.all;
use work.main_parameters.all;
entity output_selection is
  port (
    A, reg: in std_logic_vector(m-1 downto 0);
    clk, out_en, out_sel: in std_logic;
    i: in std_logic_vector(2 downto 0);
    OUT0, OUT1, OUT2, OUT3, OUT4, OUT5, OUT6, OUT7: out std_logic_vector(m-1 downto 0)
  );
end output_selection;
architecture structure of output_selection is
  signal EN: std_logic_vector(0 to 7);
  signal DEC_OUT: std_logic_vector(0 to 7);
  signal to_ports: std_logic_vector(m-1 downto 0);
begin
  decoder: process(i)
  begin
    case i is
      when "000" => DEC_OUT ...

...

  ... left_out, right_in => right_out, f => instruction(12), result => result);
comp3: output_selection port map (A => instruction(7 downto 0), reg => right_out,
  clk => clk, out_en => out_en, out_sel => instruction(13), i => instruction(10 downto 8),
  OUT0 => OUT0, OUT1 => OUT1, OUT2 => OUT2, OUT3 => OUT3, OUT4 => OUT4,
  OUT5 => OUT5, OUT6 => OUT6, OUT7 => OUT7);
comp4: register_bank port map (reg_in => reg_in, clk => clk, write_reg => write_reg,
  i => instruction(11 downto 8), j => instruction(7 downto 4), k => instruction(3 downto 0),
  left_out => left_out, right_out => right_out);
comp5: go_to port map (N => instruction(7 downto 0), data => left_out, clk => clk,
  reset => reset, numb_sel => instruction(15 downto 12), number => number);
--Boolean equations:
out_en ...

... => IN0, IN1 => IN1, IN2 => IN2, ..., instruction => instruction, clk => clk,
reset => reset, OUT0 => OUT0, OUT1 => OUT1, ..., number => number);

digital_clock
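The enable logic of Fig. 5.15b (decoder, AND2 gates, MUX2-1, and CEN registers) behaves like the following Python sketch (the register values are hypothetical):

```python
def output_selection(OUT, i, A, reg, out_en, out_sel):
    DEC_OUT = [1 if s == i else 0 for s in range(8)]   # 3-to-8 decoder
    EN = [d & out_en for d in DEC_OUT]                 # eight AND2 gates
    to_ports = reg if out_sel else A                   # 8-bit MUX2-1
    # CEN registers: register s keeps its value unless EN[s] = 1.
    return [to_ports if en else old for en, old in zip(EN, OUT)]

OUT = [0] * 8
OUT = output_selection(OUT, 3, 55, 77, out_en=1, out_sel=0)  # OUT3 = A = 55
OUT = output_selection(OUT, 3, 55, 77, out_en=0, out_sel=1)  # no change
```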
