by Lee Chin Wei
and Andrew Long
Contents
- An Overview
- Different types of Synchronous Counters
- A Comparison between Synchronous and Asynchronous Counters
- Synchronous Counter Design
- Making Fast Counters
- References
The purpose of the survey is to collate information on Digital Synchronous Counters. Particular emphasis was placed on the following areas :
The material is presented in a manner suitable for a teaching tool. It seeks to enlighten and to spark interest in the design of counters. As R. M. M. Oberman remarks, "...design of counters has, in my experience, always been an excellent proving ground for anyone who has mastered Boolean algebra..." Have fun reading!
Binary Up Counters
A synchronous binary counter counts from 0 to 2^N - 1, where N is the number of bits/flip-flops in the counter. Each flip-flop is used to represent one bit. The flip-flop in the lowest-order position is complemented/toggled with every clock pulse, and a flip-flop in any other position is complemented on the next clock pulse provided all the bits in the lower-order positions are equal to 1.
Take for example A4 A3 A2 A1 = 0011. On the next count, A4 A3 A2 A1 = 0100. A1, the lowest-order bit, is always complemented. A2 is complemented because all the lower-order positions (A1 only in this case) are 1's. A3 is also complemented because all the lower-order positions, A2 and A1, are 1's. But A4 is not complemented because the lower-order positions, A3 A2 A1 = 011, do not give an all-1 condition.
To implement a synchronous counter, we need a flip-flop for every bit and an AND gate for every bit except the first and the last bit. The diagram below shows the implementation of a 4-bit synchronous up-counter.
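The toggle rule can be tried out in a few lines of Python. This is only a behavioural sketch, not a gate-level model: the function names `next_up` and `show` are made up for illustration, and the list `bits` is assumed to hold A1 first.

```python
def next_up(bits):
    """Return the next state of a synchronous binary up counter.

    bits[0] is A1 (lowest order). A bit toggles when every lower-order
    bit is 1; the lowest-order bit always toggles.
    """
    new_bits = []
    all_lower_ones = True  # vacuously true for A1
    for b in bits:
        new_bits.append(b ^ 1 if all_lower_ones else b)
        all_lower_ones = all_lower_ones and b == 1
    return new_bits


def show(bits):
    """Format the state highest-order bit first, as in 'A4 A3 A2 A1'."""
    return "".join(str(b) for b in reversed(bits))


state = [1, 1, 0, 0]  # A1=1, A2=1, A3=0, A4=0, i.e. 0011
print(show(state), "->", show(next_up(state)))  # 0011 -> 0100
```

All the new bit values are computed from the old state before any of them is written back, which mirrors the fact that every flip-flop changes on the same clock pulse.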
1 * 15-input AND gate,
1 * 14-input AND gate,
. . .
1 * 3-input AND gate and
1 * 2-input AND gate.
This method obviously uses a lot more resources than the first method. Not only that, in the first method, the output from each flip-flop is only used as an input to one AND gate. In the second method, the output from each flip-flop is used as an input to the gates of all the higher-order bits. If we have a 12-bit counter, the output of the first flip-flop will have to drive 10 gates (its fan-out). The output from the flip-flop may not have the power to do this.
The "solution" to this is to use a compromise between the two methods. Say we have a 12-bit counter: we can organise it into 3 groups of 4. Within each group of 4, we use the second method, and between the 3 groups we use the first method. This way, the overall gate propagation delay and the maximum fan-out are both only 3, instead of the 10 incurred by the first and second method respectively.
There are many variations to the basic binary counter. The one described above is the binary up counter (counts upwards). Besides the up counter, there is the binary down counter, the binary up/down counter, binary-coded-decimal (BCD) counter etc. Any counter that counts in binary is called a binary counter.
Binary Down Counters
In a binary up counter, a particular bit, except for the first bit, toggles if all the lower-order bits are 1's. The opposite is true for binary down counters. That is, a particular bit toggles if all the lower-order bits are 0's and the first bit toggles on every pulse.
Taking an example, A4 A3 A2 A1 = 0100. On the next count, A4 A3 A2 A1 = 0011. A1, the lowest-order bit, is always complemented. A2 is complemented because all the lower-order positions (A1 only in this case) are 0's. A3 is also complemented because all the lower-order positions, A2 and A1, are 0's. But A4 is not complemented because the lower-order positions, A3 A2 A1 = 100, do not give an all-0 condition.
Binary Up/Down Counters
The similarities between the implementation of a binary up counter and a binary down counter lead to the possibility of a binary up/down counter, which is a binary up counter and a binary down counter combined into one. Since the difference is only in which output of the flip-flop to use, the normal output or the inverted one, we use two AND gates for each flip-flop to "choose" which of the outputs to use.
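As a rough behavioural sketch of this idea, the Python below picks either the normal or the inverted output of each flip-flop, depending on a direction input. The names `step`, `to_int` and the `up` flag are illustrative assumptions; the real circuit makes the choice with the pair of AND gates described above.

```python
def step(bits, up):
    """One clock pulse of a binary up/down counter; bits[0] is A1."""
    new_bits = []
    enable = True  # A1 toggles on every pulse
    for b in bits:
        new_bits.append(b ^ 1 if enable else b)
        # Use the normal output when counting up and the inverted output
        # when counting down, then AND it into the enable of the next bit.
        selected = b if up else b ^ 1
        enable = enable and selected == 1
    return new_bits


def to_int(bits):
    """Convert a bit list (A1 first) to its integer value."""
    return sum(b << i for i, b in enumerate(bits))


state = [0, 0, 1, 0]                   # 0100 = 4
print(to_int(step(state, up=False)))   # 3
print(to_int(step(state, up=True)))    # 5
```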
MOD-N/Divide-by-N Counters
A normal binary counter counts from 0 to 2^N - 1, where N is the number of bits/flip-flops in the counter. In some cases, we want it to count to numbers other than 2^N - 1. This can be done by allowing the counter to skip states that are normally part of the counting sequence. There are a few methods of doing this. One of the most common methods is to use the CLEAR input on the flip-flops.
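The effect of the CLEAR method can be sketched as follows. The function `mod_n_sequence` and its parameters are hypothetical, and the model assumes the clear acts instantly, so the brief appearance of the detected state in a real circuit with asynchronous CLEAR inputs is not shown.

```python
def mod_n_sequence(n_bits, modulus, pulses):
    """Count clock pulses, clearing the flip-flops when `modulus` is reached."""
    assert 0 < modulus <= 2 ** n_bits
    count = 0
    seen = [count]
    for _ in range(pulses):
        count = (count + 1) % (2 ** n_bits)  # normal binary count
        if count == modulus:                 # detection gate drives CLEAR
            count = 0
        seen.append(count)
    return seen


print(mod_n_sequence(3, 6, 8))  # [0, 1, 2, 3, 4, 5, 0, 1, 2]
```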
Binary Coded Decimal (BCD) Counters
The BCD counter is just a special case of the MOD-N counter (N = 10). BCD counters are very commonly used because most human beings count in decimal. To make a digital clock which can tell the hour, minute and second for example, we need 3 BCD counters (for the second digit of the hour, minute and second), two MOD-6 counters (for the first digit of the minute and second), and one MOD-2 counter (for the first digit of the hour).
Ring Counters
Ring counters are implemented using shift registers. A ring counter is essentially a circulating shift register connected so that the last flip-flop shifts its value into the first flip-flop. There is usually only a single 1 circulating in the register, as long as clock pulses are applied.
The ring counter above functions as a MOD-4 counter since it has four distinct states and each flip-flop output waveform has a frequency equal to one-fourth of the clock frequency. A ring counter can be constructed for any MOD number. A MOD-N ring counter will require N flip-flops connected in the arrangement as the diagram above.
A ring counter requires more flip-flops than a binary counter for the same MOD number. For example, a MOD-8 ring counter requires 8 flip-flops while a MOD-8 binary counter only requires 3 (2^3 = 8). So if a ring counter is less efficient in the use of flip-flops than a binary counter, why do we still need ring counters? One main reason is that ring counters are much easier to decode. In fact, ring counters can be decoded without the use of logic gates. The decoding signal is obtained at the output of its corresponding flip-flop.
For the ring counter to operate properly, it must start with only one flip-flop in the 1 state and all the others at 0. Since it is not possible to expect the counter to come up to this state when power is first applied to the circuit, it is necessary to preset the counter to the required starting state before the clock pulses are applied. One way to do this is to apply a pulse to the PRESET input of one of the flip-flops and the CLEAR inputs of all the others. This will place a single 1 in the ring counter.
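A short Python sketch of a ring counter, assuming the preset places the single 1 in the first flip-flop (the function name `ring_counter_states` is invented for this example). It also shows why no decoding gates are needed: the count is simply the position of the flip-flop holding the 1.

```python
def ring_counter_states(n):
    """Return the n states of a MOD-n ring counter, starting from 100...0."""
    state = [1] + [0] * (n - 1)           # PRESET one flip-flop, CLEAR the rest
    states = []
    for _ in range(n):
        states.append(state)
        state = [state[-1]] + state[:-1]  # last flip-flop feeds the first
    return states


for s in ring_counter_states(4):
    print(s, "decoded count:", s.index(1))
```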
Johnson/Twisted-Ring Counters
The Johnson counter, also known as the twisted-ring counter, is exactly the same as the ring counter except that the inverted output of the last flip-flop is connected to the input of the first flip-flop.
The MOD number of a Johnson counter is twice the number of flip-flops. In the example above, three flip-flops were used to create the MOD-6 Johnson counter. So for a given MOD number, a Johnson counter requires only half the number of flip-flops needed for a ring counter. However, a Johnson counter requires decoding gates whereas a ring counter doesn't. As with the binary counter, one logic gate (AND gate) is required to decode each state, but with the Johnson counter, each gate requires only two inputs, regardless of the number of flip-flops in the counter. Note that we are comparing with the binary counter using the speed-up technique discussed above. The reason for this is that for each state, two of the N flip-flops used will be in a unique combination of states. In the example above, the combination Q2 = Q0 = 0 occurs only once in the counting sequence, at the count of 0. The state 010 does not occur. Thus, an AND gate with inputs (not Q2) and (not Q0) can be used to decode for this state. The same characteristic is shared by all the other states in the sequence.
Johnson counters represent a middle ground between ring counters and binary counters. A Johnson counter requires fewer flip-flops than a ring counter but generally more than a binary counter; it has more decoding circuitry than a ring counter but less than a binary counter. Thus, it sometimes represents a logical choice for certain applications.
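The two-input decoding property can be checked by brute force. The sketch below assumes the flip-flops are labelled Q2 Q1 Q0, that data shifts from Q2 towards Q0, and that the inverted Q0 is fed back to Q2; the original diagram is not reproduced here, so this labelling is an assumption, as are the function names.

```python
from itertools import combinations


def johnson_states(n):
    """Return the 2n states of an n-flip-flop Johnson counter from all zeros."""
    state = (0,) * n
    states = []
    for _ in range(2 * n):
        states.append(state)
        # The inverted output of the last flip-flop feeds the first one.
        state = (1 - state[-1],) + state[:-1]
    return states


def two_input_decoders(states):
    """Map each state to a pair of bit positions that identifies it uniquely."""
    n = len(states[0])
    result = {}
    for s in states:
        for i, j in combinations(range(n), 2):
            matches = [t for t in states if t[i] == s[i] and t[j] == s[j]]
            if matches == [s]:
                result[s] = (i, j)
                break
    return result


states = johnson_states(3)
print(states)
# Index 0 is Q2 and index 2 is Q0, so (0, 2) means "decode from Q2 and Q0".
print(two_input_decoders(states)[(0, 0, 0)])  # (0, 2)
```

Every one of the six states receives a pair, which is the sense in which one two-input gate per state is enough.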
Loadable/Presettable Counters
Many synchronous counters available as ICs are designed to be presettable. This means that they can be preset to any desired starting value. This can be done either asynchronously (independent of the clock signal) or synchronously (on the active transition of the clock signal). This presetting operation is also known as loading, hence the name loadable counter. The diagram below shows a 3-bit asynchronously presettable synchronous up counter.
For the example above, say that P2 = 1, P1 = 0, and P0 = 1. When not(PL) is high, these inputs have no effect. The counter will perform normal count-up operations if there are clock pulses. Now let's say that not(PL) goes low at Q2 = 0, Q1 = 1 and Q0 = 0. This will produce LOW states at the CLEAR input of Q1, and the PRESET inputs of Q2 and Q0. This will make the counter go to state 101 regardless of what is occurring at the CLK input. The counter will remain at state 101 until not(PL) goes back to HIGH. The counter will then continue counting from 101.
Asynchronous counters, also known as ripple counters, are not clocked by a common pulse, and hence the flip-flops in the counter change at different times. The flip-flops in an asynchronous counter are usually clocked by the output pulse of the preceding flip-flop. The first flip-flop is clocked by an external event. A synchronous counter, however, has an internal clock, and the external event is used to produce a pulse which is synchronised with this internal clock. The diagram of a ripple counter is shown below.
First of all, the asynchronous counter is slow. In a synchronous counter, all the flip-flops will change states simultaneously while for an asynchronous counter, the propagation delays of the flip-flops add together to produce the overall delay. Hence, the more bits or number of flip-flops in an asynchronous counter, the slower it will be.
Secondly, there are certain "risks" when using an asynchronous counter. In a complex system, many state changes occur on each clock edge and some ICs respond faster than others. If an external event is allowed to affect a system whenever it occurs (unsynchronised), there is a small chance that it will occur near a clock transition, after some ICs have responded, but before others have. This intermingling of transitions often causes erroneous operations. Worse still, these problems are difficult to foresee and test for because of the random time difference between the events.
A synchronous counter usually consists of two parts: the memory element and the combinational element. The memory element is implemented using flip-flops while the combinational element can be implemented in a number of ways. Using logic gates is the traditional method of implementing combinational logic and has been applied for decades. Since this method often results in minimum component cost for many combinational systems, it is still a popular approach. However, there are other methods of implementing combinational logic which offer other advantages. Some of the alternative methods which are discussed here are: multiplexers (MUX), read-only memory (ROM) and programmable logic array (PLA).
Multiplexer
The multiplexer, also called the data selector, has n select inputs, 2^n input lines and 1 output line (and usually also a complement of the output). Each of the 2^n possible combinations of the select inputs connects one of the input lines to the output. When used as a combinational logic device, the n select inputs represent n variables and the 2^n input lines represent all the minterms of the n variables.
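To make the minterm idea concrete, here is a minimal sketch of a multiplexer used as a logic function. The helper `mux` and the table `T3_DATA` are illustrative names; the example function, the toggle condition T3 = A2 AND A1 for the third bit of an up counter, is chosen only because it appears earlier in this survey.

```python
def mux(select_bits, data_inputs):
    """2^n-to-1 multiplexer: select_bits (most significant first) pick one input."""
    index = 0
    for s in select_bits:
        index = (index << 1) | s
    return data_inputs[index]


# Select inputs are (A2, A1). Each data input is tied to the value the
# function should take for that minterm: only minterm 11 gives a 1.
T3_DATA = [0, 0, 0, 1]

for a2 in (0, 1):
    for a1 in (0, 1):
        print(a2, a1, "->", mux((a2, a1), T3_DATA))
```

Changing the logic function only means tying the data inputs to different constants; the multiplexer itself stays the same.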
Read-Only Memory
The ROM is usually used as a storage unit for fixed programs in a computer. However, it can also be used to implement combinational logic. It is useful for systems requiring changeable functions. When a different function is required, a different ROM producing this function can be plugged into the circuit. No wiring change is necessary. The ROM has n input lines pointing to 2^n locations within the ROM that store words of M bits. As with the MUX, each input line is used to represent a variable and the 2^n locations represent the minterms.
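A ROM-based counter can be sketched with a lookup table standing in for the ROM: the present state is the address and the stored word is the next state. The names `UP_ROM`, `DOWN_ROM` and `run` are made up for this illustration, with n = 3 address lines and 3-bit words assumed.

```python
N_ADDRESS_LINES = 3
LOCATIONS = 2 ** N_ADDRESS_LINES

# The word stored at address a is the state that should follow a.
UP_ROM = [(a + 1) % LOCATIONS for a in range(LOCATIONS)]
DOWN_ROM = [(a - 1) % LOCATIONS for a in range(LOCATIONS)]


def run(rom, start, pulses):
    """Clock the flip-flops `pulses` times, latching the ROM output each time."""
    state = start
    trace = [state]
    for _ in range(pulses):
        state = rom[state]
        trace.append(state)
    return trace


print(run(UP_ROM, 6, 3))    # [6, 7, 0, 1]
print(run(DOWN_ROM, 1, 2))  # [1, 0, 7]
```

Swapping `UP_ROM` for `DOWN_ROM` changes the counting sequence without touching `run`, which corresponds to plugging in a different ROM with no wiring change.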
Programmable Logic Array
The PLA is very similar to the ROM. It can be thought of as a ROM with a large percentage of its locations deleted. A ROM with 16 input address lines must have 2^16, or 65,536, storage locations, and all the words stored in these have to be decoded. The PLA only decodes a small percentage of the minterms. The PLA is sometimes used to produce a system with a small number of chips in a minimum time.
More information on these devices is given in article 2 of cwl3.
Where speed is a concern....
In certain applications, speed is an important factor affecting the choice of a counter. For example, counters used in communication and certain instrumentation applications are necessarily fast. We will be looking at some techniques commonly used to improve the speed of a counter. To reinforce the concepts presented, some commercial counters (by Xilinx) will be considered.
General Structure of a Synchronous Binary Counter
There are two common ways in which a synchronous binary counter is structured. These are, namely, the series carry synchronous counter and the parallel carry synchronous counter. These two counters are illustrated as follows:
Both counters depicted above are binary-up counters.
The T implies a T flip-flop. The flip-flop complements/toggles its output on the rising edge of a clock pulse provided its enable (EN) input is high.
From the diagrams, it can be seen that the least significant bit Q0 toggles on every clock pulse, and subsequent bits toggle when preceding bits are high. The important distinction between the two counters is the way the EN signals propagate from Q0 to Q3. This is illustrated by the highlighted paths. The signals are propagated serially and in parallel (to each AND gate) in the first and second case respectively.
The parallel carry scheme results in a much faster counter. This difference in speed is accounted for by the delay encountered during the propagation of the EN signals. To illustrate the worst case delay in both cases, we consider a change in Q0 from 0 to 1. (see diagrams above)
In the series carry scheme, the time to propagate the change in Q0 must take into account the propagation delays of the 3 AND gates (A, B, C). In the parallel carry scheme, only the propagation delay of 1 AND gate has to be considered. Therefore, the minimum clock period of the parallel scheme is shorter. Thus, the parallel synchronous carry counter operates at a greater maximum frequency. This structure is believed to be the fastest synchronous binary counter structure. In applications that require speed, this scheme is commonly used.
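The comparison can be put into numbers with a simple timing estimate. The sketch below assumes that the minimum clock period is the flip-flop delay plus the carry-path delay plus a setup time, and the figures of 5 ns, 3 ns and 2 ns are invented for illustration rather than taken from any data sheet.

```python
def min_clock_period_ns(n_bits, scheme, t_ff_ns=5, t_and_ns=3, t_setup_ns=2):
    """Estimate the minimum clock period of an n-bit synchronous up counter."""
    if scheme == "series":
        carry_gates = n_bits - 1  # EN ripples through every AND gate in turn
    elif scheme == "parallel":
        carry_gates = 1           # each EN is formed by a single AND gate
    else:
        raise ValueError("scheme must be 'series' or 'parallel'")
    return t_ff_ns + carry_gates * t_and_ns + t_setup_ns


for scheme in ("series", "parallel"):
    period = min_clock_period_ns(4, scheme)
    print(scheme, period, "ns ->", round(1000 / period, 1), "MHz")
# series 16 ns -> 62.5 MHz
# parallel 10 ns -> 100.0 MHz
```

With these assumed figures, the gap widens as bits are added, because only the series estimate grows with `n_bits`.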
This structure does have limitations. From the diagrams, it can be seen that a single flip-flop output (consider Q0) has to drive a number of subsequent AND gates. The output current of a flip-flop may not be large enough to drive that many gates. This becomes a problem as the counter gets bigger. To overcome this, a tree of AND gates is usually used. Exactly what this tree looks like is an engineering choice. This choice will reflect the trade-off between speed requirements and the constraint mentioned above.
Although the series carry scheme is slower, it does not suffer the same drawback as the parallel carry scheme. This makes it a suitable basis for making big counters. Its speed can be improved by using some form of Prescaling. This technique will be considered in subsequent sections.
Prescaling
The idea of prescaling is to provide a "prescaling" stage between the incoming clock frequency and the counting circuit. The prescaling stage is sometimes provided by a dedicated prescaling device known as the prescaler. This device/circuit is designed primarily using Emitter Coupled Logic (ECL). ECL benefits from very fast switching capabilities. This makes it suitable for high-speed counting work.
Despite its suitability to high speed counting work, it has little or no counting features since such features will only impede its operating speed. The reader does not have to concern himself or herself with the implementation of the prescaler. The reader should, however, understand the function it performs in the overall counter.
A prescaler generates a "clock" pulse after it has received a number of input pulses. This "clock" pulse is then fed to the counting circuit. For example, a divide-by-n prescaler will generate a pulse when it has received n input pulses. At present, there are prescalers that can accept frequencies ranging from a few hundred megahertz to a few gigahertz. The point of the prescaler is to divide an incoming clock and thereby provide a clock to a larger, slower counting circuit.
The curious reader would probably be wondering how the actual (and faster) incoming clock frequency is reflected in the slower counting circuit. There are a number of ways in which a prescaler can be used, but one sophisticated setup is the "pulse swallowing" counter. The characteristic of a "pulse swallowing" counter is that it stops counting when a predetermined number of pulses has been received.
The following diagram shows a down-counting Binary Coded Decimal (BCD) counter in a simplified "pulse swallowing" setup.
In the above setup, the Tens and Units sections of a BCD counter are shown. Note that a section stops counting when zero has been reached. Consequently, a carry is also generated (UC and TC). Both sections are presettable via P3-P0. The outputs (Q3-Q0) reset to the preset values when Pe is high. UC is fed back to the prescaler as the Mode (M) input signal. When M is high, the prescaler divides by 10 before generating a "clock" pulse; when M is low, it divides by 11. To demonstrate the principle of "pulse swallowing", let's consider an example.
Suppose we preset a value of 32 (0011 0010). The outputs will have values as shown below :
Clock pulses   Tens   Units   Mode (M)   Decimal value
0              0011   0010    0          32
11             0010   0001    0          21
22             0001   0000    1          10
32             0000   0000    1          00
Effectively, a "pulse swallowing" counter "swallows up" fast incoming clock pulses. This is reflected in the slower counter by simultaneously driving the Tens and Units sections. Therefore the net effect of such a combination (of prescaler and counter) is a counter operating at a much higher speed than it is capable of alone.
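The worked example can be replayed with a small behavioural model. It assumes, as the table of values suggests, that each prescaler output pulse decrements both sections, that the Units section holds at zero, and that M is simply "Units equals zero". The function name `pulse_swallow` is invented, and this simplified model only returns the preset value as the pulse total when the Units digit does not exceed the Tens digit.

```python
def pulse_swallow(tens, units):
    """Return rows of (input pulses so far, tens, units, M) until Tens is 0."""
    pulses = 0
    rows = [(pulses, tens, units, int(units == 0))]
    while tens > 0:
        mode = int(units == 0)         # UC fed back to the prescaler as M
        pulses += 10 if mode else 11   # divide by 10 when M is high, else by 11
        tens -= 1
        if units > 0:
            units -= 1                 # the Units section stops at zero
        rows.append((pulses, tens, units, int(units == 0)))
    return rows


for row in pulse_swallow(3, 2):
    print(row)
# (0, 3, 2, 0)
# (11, 2, 1, 0)
# (22, 1, 0, 1)
# (32, 0, 0, 1)
```

The two divide-by-11 cycles each "swallow" one extra input pulse, which is how the Units digit of 2 ends up in the total of 32.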
Pipelining
Pipelining is a "predict and store" technique. It "predicts" an event one (usually) clock cycle before it is to occur. Upon prediction, certain output value(s) (resulting from that event) are set. These new value(s) are stored/latched using flip-flops (usually D type). They appear at the outputs on the next clock pulse, when the event actually occurs. How does this actually help in speeding things up?
Let's say the detection of an event and the setting of the required outputs take 20 ns. The propagation of the outputs takes another 10 ns.
Consider the two situations where pipelining is used and not used. If the above actions had to be performed in one clock cycle, the minimum clock period would be 30 ns (without pipelining). If these two sets of actions were performed in two separate clock periods, the minimum clock period would be 20 ns (with pipelining). With pipelining, the overall frequency/speed of the circuit is improved. This is illustrated schematically as follows:
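The arithmetic behind these figures is small enough to write down directly. In this sketch, `min_period_ns` is a hypothetical helper: without pipelining the stage delays add up, while with pipelining the slowest stage alone sets the clock period.

```python
def min_period_ns(stage_delays_ns, pipelined):
    """Minimum clock period for a chain of stages, with or without pipelining."""
    return max(stage_delays_ns) if pipelined else sum(stage_delays_ns)


stages = [20, 10]  # detect and set outputs, then propagate them
for pipelined in (False, True):
    period = min_period_ns(stages, pipelined)
    print("pipelined" if pipelined else "not pipelined",
          period, "ns ->", round(1000 / period, 1), "MHz")
# not pipelined 30 ns -> 33.3 MHz
# pipelined 20 ns -> 50.0 MHz
```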
Fast Counters From Xilinx
# Synchronous Presettable Counter
(Xilinx Application Notes XAPP 003.002)
Maximum Clock Frequency
8 bits : 71 MHz
16 bits : 55 MHz
This counter demonstrates the parallel carry synchronous counter structure and the pipelining technique.
Let's consider an up-counting version of this counter.
D- preset via these inputs
On first sight, it looks complicated but the reader may have noticed that there are many similar blocks of logic circuitry. Let's take a look at some of these blocks and see how they work.
Consider block producing Q0
TERMINAL COUNT high
As seen below, an inverted version of Q0 is propagated through AND gate A. D0 is not propagated through B because an inverted version of TERMINAL COUNT is fed into B. Therefore the output of B is low. With this setup, Q0 toggles on every rising edge of the clock pulse.
TERMINAL COUNT low
As seen below, the inverted version of Q0 is not propagated through A. D0 is propagated through B because an inverted version of TERMINAL COUNT is fed into B. The output of the OR gate will have the value of D0. Therefore, Q0 will have the value of D0 on the next clock pulse. It is noted that the preset value D appears as the output Q on the next clock pulse (after terminal count). This applies to all bit stages.
Consider block producing Q3
TERMINAL COUNT high
As seen below, the output of the EX-OR gate C will be propagated through A. The output of C is high when either (not both) T3 or Q3 is high. T3 is the ANDed version of all preceding outputs (Q0-Q2). (Note that in the Q1 stage, the T input is replaced by Q0.) Effectively, Q3 stays the same when T3 is low. When T3 is high, Q3 toggles. Therefore, in all the bit stages, an output bit toggles when the preceding bits are high.
TERMINAL COUNT low
The preset value is loaded on the next clock pulse as before.
Consider the Carry connections
Let's focus our attention on the generation of the T outputs(see above). This counter uses an adapted version of the parallel carry scheme by employing an AND gate tree. The different outputs driving the AND gates are summarised schematically as follows :
In this setup, Q0 is fed directly to the next bit stage and in parallel to all the T(T2-T6) AND gates. This minimises the worst case delay (compare with the series carry scheme). Subsequent bits feed in parallel to the relevant AND gates. The additional gate delay introduced by Tx does not affect the critical paths from Q0 to Q7 because of the way the numbers change.
Consider the Pipelining block
When the counter output is 11111110, the NAND gate output is low. Therefore, when the value preceding terminal count is detected, the required TERMINAL COUNT value (low) is fed to the input of the flip-flop. On the next clock pulse (when it is terminal count), TERMINAL COUNT is low ("load preset value"). This propagates the preset values (D0-D7) to the inputs of the flip-flops. Note: when any other value is detected, the NAND gate output is high. Thus TERMINAL COUNT is high ("do not load preset value").
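The interplay between counting, the pipelined TERMINAL COUNT and the preset load can be followed in a behavioural sketch. It works on whole integers rather than individual gates, assumes an 8-bit counter that starts at the preset value with TERMINAL COUNT high, and uses the invented name `simulate`.

```python
def simulate(preset, pulses, width=8):
    """Trace the outputs of a presettable up counter with pipelined TC."""
    mask = (1 << width) - 1
    q, tc = preset & mask, 1  # TERMINAL COUNT high: count normally
    trace = [q]
    for _ in range(pulses):
        # Flip-flop inputs are formed from the state before the clock edge.
        next_tc = 0 if q == mask - 1 else 1              # NAND detects 11...10
        next_q = (q + 1) & mask if tc else preset & mask  # count, or load D
        q, tc = next_q, next_tc
        trace.append(q)
    return trace


print(simulate(252, 5))  # [252, 253, 254, 255, 252, 253]
```

TERMINAL COUNT goes low in the same cycle that the outputs reach 255, so the preset value is loaded on the very next pulse with no detection delay left in that cycle.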
# High-Speed Synchronous Prescaler Counter
(Xilinx Application Notes 001.002)
Max Clock Frequency
8 bits : 200 MHz
16 bits : 115 MHz
The counter demonstrates prescaling and pipelining.
The counter can be represented in a block diagram as follows :
The counter employs the concept of prescaling but does not use a dedicated prescaler (ECL device). Instead, the least significant (LS) tri-bit (Q0-Q2) provides the prescaling function. Each tri-bit responds (increments) to a clock pulse if its Count-Enable inputs (CE, CEP, CET) are high. The CEO of the LS tri-bit is high once in every 8 clock cycles, when all its outputs are high. This "prescaler" pulse effectively reduces the rate at which the rest of the tri-bits count by a factor of 8. Note that there is no change in the original clock rate. The 7 clock cycles during which the LS CEO is low give the CEO-CET ripple chain (of subsequent tri-bits) time to settle. If this prescaling were not done, the settling time would have to be taken into account when determining the minimum clock period of the counter. This would significantly lengthen the minimum clock period, thereby slowing the counter down. This will become clearer when we examine the actual implementation of this counter.
Note : all clock inputs are assumed to be driven by a common clock. Qa and Qc represent the LSB and MSB of a tri-bit respectively.
CEOs are high only when CEs (or CETs) and the outputs Q (Qa-Qc) are high.
When the Count-Enable inputs are high, the EX-OR gate complements Qa. Therefore, the Count-Enable inputs effectively "enable" or "disable" the complementing function.
When both Qa and the Count-Enable inputs are high, the complementing function is enabled.
Generation of Qc
The generation of Qc is similar to the generation of Qb except that the value of Qb is also fed into the AND gate.
The speed of the counter can be improved further by pipelining the LS CEO signal:
We see that when 110 is detected, CEO is set and fed to the flip-flop input. This value appears as CEO on the next clock pulse(when 111 occurs).
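The equivalence between the direct and the pipelined CEO can be checked with a short simulation. This sketch assumes the LS tri-bit is always enabled and starts at 000; `simulate_ls_tribit` is an invented name.

```python
def simulate_ls_tribit(pulses):
    """Return rows of (count, direct CEO, pipelined CEO) for the LS tri-bit."""
    q, ceo_reg = 0, 0
    rows = []
    for _ in range(pulses):
        ceo_direct = int(q == 0b111)  # decoded from the outputs as they appear
        rows.append((q, ceo_direct, ceo_reg))
        # Values latched on the next clock edge:
        ceo_reg = int(q == 0b110)     # detect the count preceding 111
        q = (q + 1) & 0b111
    return rows


for row in simulate_ls_tribit(9):
    print(row)
```

Both CEO columns agree on every row, but the pipelined one comes straight from a flip-flop output, so the decoding delay no longer sits in front of the CEO-CET chain.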
The actual implementation of the LS tri-bit with pipelining is seen below:
# Ultra-Fast Synchronous Counter
(Xilinx Application Notes XAPP 014.001)
Maximum Clock Frequency
8 bits : 256 MHz
16 bits : 108 MHz
In the previous example, the distribution of the CEO signal (from the LS to the MS tri-bit) introduces a line transmission delay. This counter eliminates the delay by replicating Q0 for bits after Q1. This is done by the following chain/network of flip-flops:
To best describe the function of such a network, let's take a look at the timing diagram depicting the output values :
Here Q1 and Q2 act as the second "prescaler". This additional prescaler is needed to accommodate a large counter (more bits). The effective "clock" rate provided by this prescaler to the rest of the counter is 1/8 of the actual clock rate. This second prescaling stage allows the rest of the counting circuit to employ the series carry scheme. The use of such a carry scheme allows a larger counter to be constructed.
From the diagram, CEP2 is pipelined. When the LS three bits (Q2-Q0) are 101, the output of AND gate A is high. Since Q0 is high at this point, the value of A is selected and appears at the output of multiplexer B. This value is fed to the flip-flop input. On the next clock cycle, the value appears as CEP2 (high). At this point, the LS three bits are 110. Since QY01 is low, CEP2 is selected by the multiplexer. Thus, on the next clock cycle, CEP2 is high again. This is summarised below:
Q2   Q1   Q0   A   D-input (flip-flop)   CEP2
1    0    0    1   0                     0
1    0    1    1   1                     0
1    1    0    0   1                     1
1    1    1    0   0                     1
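The rows of this table can be regenerated from a small model of the multiplexer and flip-flop. Two things are assumptions here, since the diagram is not reproduced: that AND gate A computes Q2 AND (not Q1), and that the multiplexer select behaves like Q0. The function name `cep2_trace` is also invented.

```python
def cep2_trace(start, pulses):
    """Return rows of (Q2, Q1, Q0, A, D, CEP2) as the LS three bits count up."""
    q, cep2 = start, 0
    rows = []
    for _ in range(pulses):
        q2, q1, q0 = (q >> 2) & 1, (q >> 1) & 1, q & 1
        a = q2 & (1 - q1)       # assumed form of AND gate A
        d = a if q0 else cep2   # multiplexer B: A when Q0 is high, else CEP2
        rows.append((q2, q1, q0, a, d, cep2))
        cep2 = d                # the flip-flop latches D on the clock edge
        q = (q + 1) & 0b111
    return rows


for row in cep2_trace(0b100, 4):
    print(row)
# (1, 0, 0, 1, 0, 0)
# (1, 0, 1, 1, 1, 0)
# (1, 1, 0, 0, 1, 1)
# (1, 1, 1, 0, 0, 1)
```

Under these assumptions CEP2 is high only while the LS three bits are 110 and 111, that is, for two clock cycles out of every eight.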
References
1. R. M. M. Oberman, Counting and Counters, The Macmillan Press Ltd.
2. R. M. M. Oberman, Electronic Counters, The Macmillan Press Ltd.
3. Edward J. McCluskey, Logic Design Principles, Prentice Hall International.
4. Barry Wilkinson with Rafic Makki, Digital System Design, Prentice Hall International.
5. John F. Wakerly, Digital Design: Principles and Practices, Prentice Hall International.
6. Ronald J. Tocci, Digital Systems: Principles and Practices, Prentice Hall International.
7. Joseph D. Greenfield, Practical Digital Design Using ICs, John Wiley & Sons.
8. Brian Holdsworth, Digital Logic Design, Butterworth-Heinemann Ltd.
9. M. Morris Mano, Digital Design, Prentice Hall International.
10. Christopher E. Strangio, Digital Electronics, Prentice Hall International.
11. David J. Comer, Digital Logic and State Machine Design, Saunders College Publishing.
12. Herbert Taub, Digital Circuits and Microprocessors, Prentice Hall International.
13. Xilinx Inc., The Programmable Logic Data Book, Xilinx Inc.
14. M. Morris Mano, Digital Logic and Computer Design, Prentice Hall International.
15. Mike Brookes, ISE Second Year Digital Electronics Notes.