Categories
Analog Electronics Digital Electronics Engineering

The Future of Troubleshooting

If you are an engineer who regularly works with your hands, you likely troubleshoot on a daily basis. It’s just part of the job. Sure, you can say, “I never mess up!”, but hardly anyone will believe you. Because even when your best-laid plans go perfectly, Murphy’s Law will soon kick in to balance things out. We learn to deal with these things and have developed tools and measurement equipment to help us diagnose and fix these problems: Multimeters, Electrometers, SourceMeters, Oscilloscopes, Network Analyzers, Logic Analyzers, Spectrum Analyzers, Semiconductor Test equipment (ha, guess I know a little about that stuff)…the list goes on and on. But what has struck me lately is that as parts on printed circuit boards get smaller and smaller, troubleshooting is getting…well…more troubling.

  1. Package Types — I don’t want to get into another discussion of analog vs digital, but I will say that digital parts on average have many more pins, which complicates things. And as the parts get more and more complex, they require more and more pins. The industry solution was to move to a Ball Grid Array package, using tiny solder balls on the bottom of the chip that line up with a grid of similarly sized pads on the board. When you heat up the part, the solder balls melt, hold the chip in place and connect all of the signals. The problem is the size of the solder balls and the connecting vias: they’re tiny. Like super tiny. Like don’t try probing the signals without a microscope and some very small probes. But wait, it’s not just the digital parts! The analog parts are getting increasingly small to fit in alongside those now-smaller-but-still-considerably-bigger-than-analog digital parts. You thought probing a digital signal was tough before? Now try measuring something that has more than 2 possible values!
  2. Board Layers — As the parts continue on their shrink cycle, the designers using these parts also want to place them closer together (why else would they want them so small?). The circuit board designers route signals down through the different layers of insulating material so that multiple planes can be used to route isolated signals to different points on the board. So to actually route any signals to the multitude of pins available, more and more board layers are required as the parts get smaller and closer together. Granted, parts are still mounted on either the top or bottom of the board. But if a single signal is routed from underneath a BGA package, down through the fourth layer of an 8 layer board and then up to another BGA package, the signal will be impossible to see and measure without ripping the board apart.
  3. High Clocks — As systems are required to go faster and faster, so are their clocks. Consumers are used to seeing CPU speeds in the GHz range and others using RF devices are used to seeing even higher, into the tens of GHz. The problem arises when you try to troubleshoot these high speed components. If you have a 10 GHz digital signal and you expect the waveforms to be in any way square (as opposed to sinusoidal) you need to have spectral data up to the 5th harmonic. In this case, it means you need to see 50 GHz. However, as explained with analog to digital converters in the previous post, you need to sample at a minimum of twice the highest frequency you are interested in to be able to properly see all of the data. That means sampling at 100 GHz! I’m not saying it’s impossible, just that the equipment required to make such a measurement is very pricey (imagine how much more complicated that piece of equipment must be). High speed introduces myriad issues when attempting to troubleshoot non-working products.
  4. Massive amounts of data — When working with high speed analog and digital systems there is a good amount of data available. The intelligent system designer will be storing data at some point in the system either for debugging and troubleshooting or for the actual product (as in an embedded system). When dealing with MBs and even GBs of data streaming out of sensors and into memories or out of memories and into PCs, there are a lot of places that can glitch and cause a system failure. With newer systems processing more and more data, it will become increasingly difficult to find out what is causing the error, when it happened and how to fix it.
  5. Fewer Pins Available out of Packages — Even though digital packages are including more and more pins as they get increasingly complex, the packages often cannot provide enough spare pins to do troubleshooting on a design. As other system components that connect to the original chip also get more intricate (memories, peripherals, etc), they will require more and more connections. The end result is a more powerful device with a higher pin count, but not necessarily more pins available for you the user/developer to use when debugging a design.
  6. Rework — Over a long enough time period, the production of printed circuit boards cannot be perfect. The question is what to do with the product once you realize the board you just constructed doesn’t work. When parts were large DIP packages or, better, socketed (drop in replacements), changing out individual components was not difficult. However, as the parts continue to shrink and boards become increasingly complex to accommodate the higher pin counts, replacing the entire board sometimes becomes the most viable troubleshooting action. Environmentally this is a very poor policy. As a business, this often seems to be a decent method (if the board costs less than the labor needed to try and replace tiny components) but if and when the failures stack up, the board replacement idea quickly turns sour.
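The arithmetic in the High Clocks item can be sketched in a few lines of Python. The function name and its parameters are made up for illustration; the defaults (5 harmonics, a factor of 2 for sampling) simply encode the rules of thumb from the list above, and a real measurement would want extra margin on top.

```python
def required_sample_rate_hz(clock_hz, harmonics=5, nyquist_factor=2):
    """Estimate the bandwidth and minimum sample rate for a digital clock.

    harmonics:      highest harmonic needed for a reasonably square edge
                    (5 is the rule of thumb used in the text, an assumption).
    nyquist_factor: sample-rate multiple of the highest frequency of interest
                    (2 is the theoretical minimum).
    """
    bandwidth_hz = clock_hz * harmonics
    sample_rate_hz = bandwidth_hz * nyquist_factor
    return bandwidth_hz, sample_rate_hz


bandwidth, rate = required_sample_rate_hz(10e9)  # the 10 GHz example
print(f"{bandwidth / 1e9:.0f} GHz of bandwidth, sampled at {rate / 1e9:.0f} GHz or more")
```

Plugging in a slower clock, say 100 MHz, shows why everyday debugging is still affordable: the same rule only asks for 500 MHz of bandwidth.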

While the future of troubleshooting looks more and more difficult, there have always been solutions and providers that have popped up with new tools to assist in diagnosing and fixing a problem. In fact, much of the test and measurement industry is built around the idea that boards, parts, chips, etc are going to have problems and that there should be tools and methods to quickly find the culprit. Let’s look at some of the methods and tools available to designers today:

  1. DfX — DfX is the idea of planning for failure modes at the design stage and trying to lessen the risk of those failures happening. If you are designing a soccer ball, you would consider manufacturability of that ball when designing it (making sure the materials used aren’t super difficult to mold into a soccer ball), you would consider testability (making sure you can inflate and try out the ball as soon as it comes off the production line) and you would consider reliability (making sure your customers don’t return deflated balls 6 months down the line that cannot be repaired and must immediately be replaced). All of these considerations are pertinent to electronics design and the upfront planning can help to solve many of the above listed problems:
    1. Manufacturability — Using parts that are easy to put onto the board cuts down on problem boards and possibly allows for easier removal and rework in the event of a failure. It becomes a balancing act between utilizing available space on the board and using chips that are easier to troubleshoot.
    2. Testability — Routing important signals to a test pad on the top of a board before a design goes to the board house allows for more visibility into what is actually happening within a system (as opposed to seeing the internal system’s effect on the top level pins and outputs).
    3. Reliability — In the event you are using parts that cannot easily be removed and replaced and you are forced to replace entire boards, you want to make sure your board is less likely to fail. It will save your business money and will ensure customer satisfaction.
  2. Simulation — One of the best ways to avoid problems in a design is to simulate beforehand. Simulation can help to see how a design will react to different input, perform under stressful conditions (e.g. high temperature) and in general will help to avoid many of the issues that would require troubleshooting in the first place. A warning that cannot be overstated though: simulation is no replacement for the real thing. No matter how many inputs your simulation has and how well your components are modeled, no simulation can perfectly match what will happen in the real world. If you are an analog designer, simulate in SPICE to get the large problems out of the way and to figure out how different inputs will affect your product. Afterward, construct a real test version of your board or circuit and make sure your model fits your real world version. By assuming something will go wrong with the product, you will be better prepared for when it does and will be able to fix it faster.
  3. Very very steady hands — Sometimes you have to accept the fact that you messed up the signal traces on your board and you have to rewire it somehow. My analog chip designing friends needn’t worry about trying this…chips do not have the option for re-wiring without completely reworking the silicon pathways that build the chip. In the event you do mess up and have to try and wire a BGA part to a different part of the board or jumper 0201 resistors, make sure you have a skilled technician on hand or you have very steady hands yourself. And in the event you find yourself complaining about how small the job you have to do is, think of the work that Willard Wigan does…and stop complaining.
  4. On the Chip/Board tools — Digital devices have the benefit that they can be stopped and started at almost any point in a program (debug). Without being able to ascertain what the real world output values are though, that doesn’t help too much. If you did not Design for Test and pull the signals you need to probe up to the top level before the board was built, there are a few other options. One option is to try and read your memory locations or your processor internals directly by communicating through a debugger interface. But if you are looking at a multitude of signals and want to see exactly how the output pins look when given a certain input there is another valuable tool known as “boundary scan”. The chip or processor will accept an interface command through a specified port and then serially shift the values of the pins back out to you. Anytime you ask the chip for the exact state of all the pins, an array of ones and zeros will return which you can then decode to see which signals and pins are high or low.
  5. Expensive equipment — As mentioned above when describing an RF system’s measurement needs, there will always be someone who is willing to sell you the equipment you need or work to create a new solution for you. They will just charge you a ton for it. In cases I have seen where a measurement is really difficult to make or you need to debug a very complicated system, the specially made measurement solutions often perform great where you need them, but are severely limited outside of their scope. To use the example from before, if you needed a 100 GHz oscilloscope, it is likely whoever is making it for you will deliver a product that can measure 100 GHz. But if you wanted that same scope to measure 1 GHz, it would not perform as well because it had been optimized for your specific task. However, there are exceptions to this and certain pieces of equipment sometimes seem like they can do just about anything.
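To make the boundary scan description a little more concrete, here is a minimal Python sketch of the decoding step only: turning the shifted-out ones and zeros into named pin states. The pin names and their order are invented for an imaginary 8-pin device; on a real part the order of the scan chain comes from the vendor’s description file, and actually talking to the chip needs a debug adapter that is not shown here.

```python
# Hypothetical scan order for an imaginary 8-pin device (an assumption for
# illustration; a real part documents its own boundary register order).
PIN_NAMES = ["CLK", "RESET_N", "CS_N", "MOSI", "MISO", "IRQ", "LED0", "LED1"]


def decode_boundary_scan(bits, pin_names=PIN_NAMES):
    """Map a captured bit string (first bit = first pin) to {pin: is_high}."""
    if len(bits) != len(pin_names):
        raise ValueError(f"expected {len(pin_names)} bits, got {len(bits)}")
    if set(bits) - {"0", "1"}:
        raise ValueError("capture may only contain the characters 0 and 1")
    return {name: bit == "1" for name, bit in zip(pin_names, bits)}


def pins_high(bits, pin_names=PIN_NAMES):
    """Return the names of the pins that were high, in scan order."""
    states = decode_boundary_scan(bits, pin_names)
    return [name for name, high in states.items() if high]


captured = "10100010"  # made-up capture, as if shifted out of the chip
print(pins_high(captured))  # ['CLK', 'CS_N', 'LED0']
```

Comparing two captures taken before and after an input changes is often more useful than reading a single one, since it shows exactly which pins reacted.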

Debugging is part of the job for engineers. Until you become a perfect designer it is useful to have methods and equipment for quickly figuring out what went wrong in your design. Over time you become better at knowing which signals will be critical in a design and planning on looking at those first, thereby cutting down on the time it takes to debug a product. And as you get more experience you recognize common mistakes and are sure not to design those into the product in the first place.

Do you know of any troubleshooting tools or methods that I’ve missed? What kinds of troubleshooting do you do on a daily basis? Let me know in the comments!

Circuit Board Design (And How It Has Changed)

Products today mostly use Printed Circuit Boards (or PCBs) to successfully route signals from one component in a circuit to the next. There are multiple layer circuit boards with printed metal “wires” that run between the various elements in a circuit. However, this was not always the case. In the good ol’ days, there were different variations and precursors to the PCB. Some of these included point to point wiring (just soldering a wire between say a resistor and a capacitor), wire wrap boards (think of a point-to-point board on a grid with more wires than you’d know what to do with), acid etched copper on dielectric (think of a 1 layer PCB with very large and rounded signal traces) and many others. These kinds of boards were built with many different methods but also had fewer restrictions than modern designs. In fact, Paul Rako from EDN recently wrote a great article on prototyping using some of these older methods. He references many of the rapid prototyping techniques of greats like Bob Pease and Jim Williams. It’s an information rich article and I would highly suggest checking it out. OK, back to the party.

So what has changed when moving from older boards and circuit designs to newer circuit boards?

  1. Speed — There’s no denying that the boards of today are faster than those of yesteryear. The extremes are apparent in the RF industry which is/was doing well because of the cell phone becoming the hottest platform to develop for (PCs are still around of course but the excitement is in the cell phone industry). When frequencies get into the GHz range and you’re trying to guide signals instead of wire them, you know that your boards will be finicky. Additionally, the speed increase is not limited to the RF industry as many new designs have at least some component of a clocked digital system on them. Even pushing into the MHz range can be difficult with older board techniques. Wiring point to point is not as viable with high speed signals, especially when you have upwards of 32 wires between two components (a data bus).
  2. Size/Type of components – This is another symptom of newer industries. As products go increasingly mobile, parts begin to shrink out of necessity or because the cost of making the older, larger parts becomes prohibitive. As such, the boards have made a large change going from through-hole components (like the capacitors in the picture at the top of this site), to Surface Mount Technology. This has affected the construction of final boards (SMT usually requires machine placement for quick and reliable boards). This also means that the amount of power a board containing only SMT parts can absorb (when the board is considered as one entity) is reduced as the smaller SMT parts cannot handle as much current without blowing up.
  3. Number of connections — I’ve included a picture of wire wrap from the Wikimedia Commons site below. Notice anything about it? It’s ridiculous! And I would encourage you to go to the Wikipedia page and look into some of the other types of wire wrapped boards. Now let’s look at a common package today, the Ball Grid Array (BGA). This type of package uses little solderballs on the bottom of the package to adhere to the board. It is glued on at first and when you reflow (heat up to make the solder melt), the balls fall into place on whatever PCB you have produced (assuming you have made the PCB correctly). BGAs start around 144 pins (maybe 196?) I believe and go to upwards of 1000 pins per part. Can you imagine trying to hook up 1000 wires like below? I don’t think so.
  4. RoHS — Lead is bad for the environment, for your health and for any children who decide to ingest it. In fact, the only people who speak the wonders of lead these days are cranky analog engineers such as myself, trying to solder something (I’m a 6 out of 10 on the cranky scale). Why do we love lead? Because Lead-Tin (Pb-Sn) solder is much easier to work with due to the lower melting temperature and higher thermal capacity. So as RoHS becomes more widespread, with the Silver-Tin (Ag-Sn) solder that is more difficult to work with, it becomes another element of board design that must change.

So obviously some stuff has changed. Some is for the better, some not so much. Let’s look at some of the problems that show up in modern printed circuit boards now that they have become cheaper and more repeatable to make:

  1. Capacitance in the board — Printed circuit boards are constructed from a non-conducting material so that signals do not leak from one lead to another. However, in constructing the perfect insulator, they also created a material with a significant (but not huge) dielectric constant. This means if two signals are routed over top of one another (acting like plates), then the sandwich of the signal and the dielectric will act like a capacitor. Not only that, but as you increase the frequency of a signal (with speeds upwards of GHz), the capacitor looks more and more like a shorted wire! This phenomenon is known as “cross-talk” and can affect myriad high-speed or high voltage situations.
  2. Inductance in the leads of a chip — Before the BGAs mentioned in point 3 above, there were packages (usually square) with leads coming out the sides known as Quad Flat Packs (QFPs). The leads coming out of them vary in thickness, but usually get thinner as there are more leads on a chip. As the leads get thinner and longer, the inductance of those leads goes up. We remember that inductors are the “opposite” of capacitors in that they allow low frequency signals to pass and block high frequency signals. In a system that is mostly high frequency signals (think digital), the inductance of the leads can have a serious effect on how well a signal propagates from one element on a circuit board to the next. BGAs have started to reduce this problem, but the cost of dealing with BGAs can be quite prohibitive for smaller operations.
  3. Timing — In a high speed system that requires signals to depart a component at a certain time and arrive at a different component a short (predictable) while later, there are many things that can prevent the signal from arriving undisturbed. We’ve already seen the capacitive and the inductive effects mentioned above, but what about resistance? Although everything has some amount of resistance, the lines in a board routing one component to the next can have an effect on the overall performance of a circuit. If one of these lines is longer than another then there will be a noticeable difference in the resistance of that line. Most importantly, when comparing the impedance (the combination of the resistance and the frequency dependent effects of the inductance and capacitance) of two different lines going between components (say a processor and a RAM chip), differences can cause the signals to arrive at different times in different conditions. The rise times, the fall times, the overshoot, the undershoot, and the general shape of a signal can all be affected by the characteristics of the connection. It is useful to remember that every connection really acts like an RLC filter circuit, the only difference being how much resistance, inductance and capacitance are present and how they will affect the final signal.
  4. Ground/Power Plane — Another advantage a circuit board brings, especially a multilayer circuit board, is the ability to route a plane of power or a grounding plane underneath a portion of a design. If we think of a PCB as a large sandwich, the grounding plane would be like a slice of cheese, running underneath many of the different components of a circuit but not necessarily connected to them. If you design a circuit to have “vias” then an example component on the very top of a circuit board can connect down to the plane and access the power, ground or whatever signal happens to be running underneath there. This technique can be quite useful if you have many different op-amps in a certain area that will require positive and negative power supplies. Or if you have a large connector that requires a majority of pins to be grounded, a grounding plane can be useful to quickly connect many signals to the same net. However, as with any system, there are real-world consequences to deal with; in this case, we have to deal with electrons acting like electrons. With grounding planes, all of the pins on a board that are tied to ground will technically be at ground; however, if one pin happens to have a large current going into the ground, then that area might have a slightly higher potential (voltage) than other areas of the grounding plane. This could have some definite effects in sensitive electronic situations and should be considered when designing a new PCB.
  5. Heat/Warping — A major downside to PCBs is the rigidity of the material; worse, when it heats up, it can often warp and become unusable. This could also be a problem in acid etched and wire wrap boards (the warping), but since the connections are often either larger traces or wires, the chances that the warping would break the connection are lower. Worse yet, the example above (dumping current into a ground plane) can create its own heat and warp a board without even being in a heated environment. Thermal budgets become important in any new PCB design and you should be mindful of them. Some SPICE programs even allow you to check out what the heat/power dissipation will be before putting the components on a board.
  6. Low Power — Unfortunately for high power circuit manufacturers, PCBs require extra care when they contain high voltages or high currents. Newer boards are often optimized for power savings, so high power situations are not as much of a priority for the tools that create PCBs. There are often constraints in the layout programs to ensure proper safety requirements, but other steps might be necessary, like separating high power lines from one another so they do not spark or create noise on other lower power lines.
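The frequency behavior described in the capacitance and inductance items above follows from the standard reactance formulas, X_C = 1/(2πfC) and X_L = 2πfL. A rough Python sketch is below; the 1 pF and 5 nH values are placeholder guesses for a trace-to-trace capacitance and a package lead inductance, not measurements from any particular board.

```python
import math


def capacitive_reactance_ohms(freq_hz, cap_farads):
    """X_C = 1 / (2*pi*f*C): falls as frequency rises (looks more like a short)."""
    return 1.0 / (2.0 * math.pi * freq_hz * cap_farads)


def inductive_reactance_ohms(freq_hz, ind_henries):
    """X_L = 2*pi*f*L: rises with frequency (blocks fast signals more)."""
    return 2.0 * math.pi * freq_hz * ind_henries


# Assumed parasitics, chosen only to show the trend.
TRACE_CAP_F = 1e-12   # 1 pF between two overlapping traces (assumption)
LEAD_IND_H = 5e-9     # 5 nH for a long, thin package lead (assumption)

for freq_hz in (1e3, 1e6, 1e9):
    xc = capacitive_reactance_ohms(freq_hz, TRACE_CAP_F)
    xl = inductive_reactance_ohms(freq_hz, LEAD_IND_H)
    print(f"{freq_hz:>8.0e} Hz: X_C = {xc:.3g} ohm, X_L = {xl:.3g} ohm")
```

At audio frequencies the stray capacitor is effectively an open circuit and the lead is effectively a wire; by the GHz range both land in the tens to hundreds of ohms, which is comparable to the impedances the signals themselves are driving.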

Printed circuit boards allow for reliable products that can quickly be deployed to customers or used in a lab situation to test new circuit configurations. As long as you are mindful of the pitfalls of PCBs listed above, you can create circuit designs that can do just about anything imaginable.  If you have any suggestions on how to create better PCBs or circuits in general, please leave your thoughts in the comments.