11/17/2015, Chugging along

Boy, that is probably the dullest title of all the blog posts, so far, but, well, there it is.  Basically I’m converting the code from the old processor to code on the new processor, updating some of it since it is being changed from an 8 bit processor to a 32 bit processor.  The change in processor size makes some things easier and some things more difficult, but everything has to be revisited.  The ports are set up more coherently on this processor which simplifies a lot of the code.

I have currently converted the processing for the input boards and the solenoid drivers.  I had already done the Neopixel stuff, but need to use interrupts for the Neopixels so it doesn’t affect the switch/solenoid processing as much.  (It’s just the right thing to do).  I’ve added code to save the configuration off to flash memory, and retrieve it so the configuration is persistent.  Error checking has been added to the non-volatile configuration so that it is guaranteed to be valid.  Saving the configuration in flash allows the boards to be run without being tethered to a computer.  (AKA Joe/Cactus Jack configuration, and well the configuration I use most often when wiring up a playfield and batting around the ball).

The last big thing that I need to program before starting testing is the serial communication.  Should be a pretty simple module, just haven’t done it yet.  I need to merge all the commands from the input and solenoid driver into a single file.  At that point, after testing, it should be ready for the next step.

So I need some way to test the boards.  I’ll probably start out updating the PinBrdGUI python application to support Gen2 cards.  That will test the basic functionality.  Then I will probably hook the boards up to the old playfield that I used for original testing of the first generation of boards.  (I can’t even remember the name of that playfield at this moment)   It allows me to shoot the ball around, look at inputs, etc.  It’s not a very exciting playfield, but it is good enough for testing.

So the eventual destination of these boards is in a Dolly Parton machine that is going to be rethemed.  The goal is to not change any of the wiring harnesses and hook the boards directly up to the Dolly playfield.  (This excludes the sound and displays, because those are going to be upgraded to an LCD display and full stereo music much like SS3.)  I’ll probably spin a board to do the interconnections, and why not, it’s cheap.  It means that I will need to completely understand the wiring harness of the Dolly machine, but that shouldn’t be too bad.

The one thing that needs to be supported is a switch matrix and a lamp matrix.  I’ve already figured out how to do the switch matrix in the code.  It will need a couple more serial commands, but they are all based on the input board.  It will simply look like 8 output drivers to power the column strobe, and 8 input drivers to read each of the signals.  It will then send all 64 bits of data (8 bytes) back as a single command.
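A rough sketch of what that scan could look like in C.  This is not the OPP firmware, just an illustration of the idea.  The strobe/read hooks and the sim_* helpers are hypothetical names made up for the example, and the helpers stand in for real port accesses so the loop can be run on a PC.

```c
#include <stdint.h>

#define MATRIX_COLS 8

/* Hypothetical hardware hooks: drive one column strobe, read the 8 row inputs. */
typedef void (*strobe_fn)(uint8_t column);
typedef uint8_t (*read_rows_fn)(void);

/* Scan all 8 columns and fill an 8 byte (64 bit) result, one byte per column. */
void scan_switch_matrix(strobe_fn strobe, read_rows_fn read_rows,
                        uint8_t result[MATRIX_COLS])
{
   uint8_t col;

   for (col = 0; col < MATRIX_COLS; col++)
   {
      strobe(col);               /* energize a single column */
      result[col] = read_rows(); /* bit n set means the row n switch is closed */
   }
}

/* Simulated playfield so the scan can be exercised without hardware. */
static uint8_t sim_switches[MATRIX_COLS];
static uint8_t sim_active_col;

static void sim_strobe(uint8_t column)
{
   sim_active_col = column;
}

static uint8_t sim_read_rows(void)
{
   return sim_switches[sim_active_col];
}
```

On real hardware a short settling delay between the strobe and the read would probably be needed.  The 8 result bytes are what would go back in the single serial response.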

I haven’t figured out how I’m going to do the lamp matrix yet, but it may require another new board, or I might be able to combine two incandescent driver boards.  Just haven’t taken the time to do the design work yet.

That’s all I have for today.

10/16/2015, Interface document updated

Been porking the pooch lately on pinball.  Don’t really have much to show for my efforts, but alas, there is a new version of the interface document for the Gen 2 hardware.  Since I believe in interface based design, writing up the interface specification is the first major step to actually talking to the boards and supporting them.  With this awesome document, somebody could write an interface to the Mission Pinball Framework (MPF).

Here is a quick link to the document:


The Gen 2 boards add a CRC8 to most command/responses to watch for serial interface errors.  Neopixel command and support have been added.  Most of the original commands are unchanged except for adding the CRC8 at the end of the command/response.  New commands have also been added to write the wing board configuration.
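For anyone writing the host side, here is a minimal bitwise CRC8 sketch.  The polynomial (0x07) and the initial value (0x00) are assumptions picked for illustration, since this post does not state them.  The interface document is the authority on the real parameters.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed parameters, not taken from the interface document. */
#define CRC8_POLY 0x07u
#define CRC8_INIT 0x00u

/* Compute a CRC8 over len bytes, most significant bit first. */
uint8_t crc8(const uint8_t *data, size_t len)
{
   uint8_t crc = CRC8_INIT;
   size_t i;
   int bit;

   for (i = 0; i < len; i++)
   {
      crc ^= data[i];
      for (bit = 0; bit < 8; bit++)
      {
         if (crc & 0x80u)
         {
            crc = (uint8_t)((crc << 1) ^ CRC8_POLY);
         }
         else
         {
            crc = (uint8_t)(crc << 1);
         }
      }
   }
   return crc;
}
```

With these parameters, running the CRC over a message plus its appended CRC byte gives zero, which makes the receive side check a single compare.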

If browsing the code in the source forge repository, the trunk/Docs folder has all of the documentation.  A couple people have mentioned that they hadn’t seen the documentation.  Part of this stems from the fact that it was moved from a Google repository to a source forge repository, and there is now an empty folder called Docs at the head of the repository.

If I don’t publish this now, I will run into the fact that I’ll just have to add it to the next entry.  Sorry for the lack of content.

9/25/2015, Neopixel Eye Candy

Looks like this weekend is the time to try and hookup the Neopixels.  There have been a bunch of small changes that I ended up making to get things working as efficiently as possible.

So I got a little overzealous in the belief that I had infinite processing power.  The SPI bus is running at 2.4 MHz, so each bit is 417 ns.  8 * 417ns = 3.33us per byte.  Hmmm, that’s pretty darn fast come to think of it.  Assuming 1.5 clocks per assembly instruction (RISC based processor with most stuff 1 cycle, but loading, storing, and pushing info onto the stack take more cycles.  Branches are particularly long since the pipeline may need to be flushed), that means the processor has about 53 instructions (80 clocks at 24MHz) to prepare each byte.  In a tight loop the processor was not able to keep the FIFO filled.
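The budget above can be worked out in a few lines of C.  This sketch reads the 1.5 figure as clocks per instruction, which is what produces the 53 instruction number at the default 24MHz clock.  The function names are made up for the example.

```c
#include <stdint.h>

/* CPU clocks that elapse while one 8 bit byte shifts out on the SPI bus. */
uint32_t clocks_per_spi_byte(uint32_t cpu_hz, uint32_t spi_hz)
{
   return (8u * cpu_hz) / spi_hz;
}

/* Instructions that fit in that window at an assumed 1.5 clocks/instruction. */
uint32_t instructions_per_spi_byte(uint32_t cpu_hz, uint32_t spi_hz)
{
   return (clocks_per_spi_byte(cpu_hz, spi_hz) * 2u) / 3u;
}
```

Doubling the CPU clock doubles the budget, so the same estimate gives about 106 instructions per byte at 48MHz.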

The processor is only running at 24MHz (the default), so let’s kick that up to 48MHz which is what the processor can run.  That made it so the processor could keep the FIFO full without an issue.  The dream of dynamically determining pixel color is probably not possible without hand coding assembly.  It is much simpler to allocate the required memory at initialization, and update the RAM on the periodic timer.

Kept thinking about what else could be done.  The TX FIFO for the SCB can be changed to 16 bits wide instead of 8 bits wide.  This is somewhat annoying since each pixel takes 9 bytes of data, and now pixels must cross a FIFO write boundary.  Last optimization is using the TX FIFO level to kick off the processing.  If the level is set to 4, there are 5 16 bit pieces of data (4 TX FIFO slots, plus the shift register itself) of interrupt latency that can be tolerated.  That is about 33.33us of interrupt latency that can be handled without underflowing the FIFO and not properly updating the chain of Neopixels.
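The latency number falls straight out of the FIFO trigger level.  A small sketch of the arithmetic, with a made up function name:

```c
#include <stdint.h>

/*
 * Interrupt latency (in ns) that can be tolerated before the SPI underflows.
 * When the interrupt fires there are trigger_level words left in the TX FIFO
 * plus one more word in the shift register.
 */
uint32_t fifo_latency_ns(uint32_t trigger_level, uint32_t word_bits,
                         uint32_t spi_hz)
{
   uint64_t bits = (uint64_t)(trigger_level + 1u) * word_bits;

   return (uint32_t)((bits * 1000000000ull) / spi_hz);
}
```

For a trigger level of 4, 16 bit words, and a 2.4 MHz bus, this works out to 80 bit times, or a little over 33us.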

So for 64 Neopixels, the amount of RAM required is (64 * 9) (RAM for holding data sent on SPI bus) + 64 (holds current command for each Neopixel) = 640 bytes.  The processor has 8K, so only using half the RAM for this allows about 400 Neopixels per board.  Of course, there is no reason that you couldn’t put multiple boards into the machine to support more, but well, that seems ludicrous.
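The RAM math as a quick sketch, using the per pixel numbers from the paragraph above (9 bytes of SPI data plus 1 command byte).  The names are only for illustration.

```c
#include <stdint.h>

#define SPI_BYTES_PER_PIXEL 9u /* 24 color bits * 3 SPI bits / 8 */
#define CMD_BYTES_PER_PIXEL 1u /* current command for the pixel */

/* RAM needed to hold the SPI stream and the command bytes for a chain. */
uint32_t neopixel_ram_bytes(uint32_t num_pixels)
{
   return num_pixels * (SPI_BYTES_PER_PIXEL + CMD_BYTES_PER_PIXEL);
}

/* Largest chain that fits inside a given RAM budget. */
uint32_t max_neopixels(uint32_t ram_budget_bytes)
{
   return ram_budget_bytes / (SPI_BYTES_PER_PIXEL + CMD_BYTES_PER_PIXEL);
}
```

A 4096 byte budget comes out to 409 pixels, which is where the "about 400" figure comes from.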

Here is a quick picture of the demo systems setup:

Neopixel Demo

Quick video of the Neopixels working:

There is something weird going on with the first Neopixel.  It might be that I got it a little too hot when I soldered the wires to the strip.  (I didn’t “tin” the pads before soldering on the leads, which meant I had to use a lot more heat).  It could also be a bug in the program, and I should probably send a blank 16 bit word before sending the first bit of data.  At this point, it is good enough to hand off.

I’m getting pretty familiar with the processor at this point, so it is getting faster for me to add new features.  I really have to work on the next serial commands to interface with the next generation of the boards.

9/17/2015, Neopixels, Get Off My Back

So multiple people have asked me about Neopixels and do I support them.  I was so busy looking at other stuff, I simply did not have the bandwidth to look into Neopixels.  As the work on SS3 was winding down this Summer, and I started to look into the next generation boards, I thought that I should do some research to make sure that I didn’t preclude their use.  In July I did read about them, and even included a wing board to make interfacing with them as simple as possible.

I was originally hesitant about Neopixels because it adds one more layer of difficulty to programming a pinball machine.  Right now, the OPP hardware only supports turning on and off lights (either incandescent or LED bulbs).  The framework supports turning the bulbs on or off, or blinking the lights slowly or rapidly.  The blinking happens automatically in the framework so the user doesn’t have to bother with changing the lights all the time.  If I switch to using Neopixels, they are not simply on/off, but now they can support different colors which will be even more difficult to use.

I spent a couple of days trying to figure out the best way to command a “smart” Neopixel controller.  One would be to continuously send the color for each pixel, and that would provide all the features.  That would use a lot of the bandwidth of the serial links between the boards and is very inefficient.

Instead, I decided to use a byte for each of the pixels.  The byte contains a command (turn pixel on, blink quickly, blink slowly, fade quickly, or fade slowly), and a color index which looks up the pixel’s color using a color table.  The color table contains 32 possible colors, and of course there will be commands to change the color table as necessary.  (Each entry in the color table contains the 8 bits for red, green and blue parts of the color).  The blink commands simply turn the pixels on and off with the chosen color in a synchronized fashion.  The fade commands go from dark to bright, then back to dark to allow the pixels to pulse.
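One way the per pixel byte could be laid out.  The bit positions and the command values here are assumptions for illustration (3 bits of command on top, 5 bits of color index on the bottom).  The post only says that the byte holds a command and an index into a 32 entry color table.

```c
#include <stdint.h>

#define COLOR_TABLE_SIZE 32u
#define COLOR_INDEX_MASK 0x1fu /* low 5 bits: index into the color table */
#define CMD_SHIFT        5u    /* high 3 bits: command (assumed layout) */

/* Hypothetical command values, one per command listed in the post. */
typedef enum
{
   PIXEL_CMD_ON = 0,
   PIXEL_CMD_BLINK_FAST,
   PIXEL_CMD_BLINK_SLOW,
   PIXEL_CMD_FADE_FAST,
   PIXEL_CMD_FADE_SLOW
} pixel_cmd_t;

/* One color table entry: 8 bits each for red, green, and blue. */
typedef struct
{
   uint8_t red;
   uint8_t green;
   uint8_t blue;
} color_entry_t;

/* Pack a command and a color index into the single byte sent per pixel. */
uint8_t pixel_pack(pixel_cmd_t cmd, uint8_t color_index)
{
   return (uint8_t)(((uint8_t)cmd << CMD_SHIFT) |
                    (color_index & COLOR_INDEX_MASK));
}

/* Extract the command from a packed pixel byte. */
pixel_cmd_t pixel_cmd(uint8_t packed)
{
   return (pixel_cmd_t)(packed >> CMD_SHIFT);
}

/* Extract the color table index from a packed pixel byte. */
uint8_t pixel_color_index(uint8_t packed)
{
   return (uint8_t)(packed & COLOR_INDEX_MASK);
}
```

With this layout, 64 pixels cost 64 bytes per update on the serial link, versus 192 bytes if the full color were sent for every pixel.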

I’m planning on trying to get the Neopixel code up and running on the PSoC 4200 board this weekend or next.  (It really depends on how quickly I can bring up the debugger).  I’ll use short button presses to change the color, and long button presses to change the commands to demonstrate that it is working properly.

Here is a concrete example why using the simple generic code that Creator provides will not work and why such code should be avoided.  To talk to the Neopixels, the processor uses the SCB SPI bus component to stream the data.  The SPI bus will be running at 2.4 MHz to meet Neopixel timing specs (3 SPI bits for every 1 Neopixel bit, matching the Lady Ada Uberguide specs).  Every 20 ms the Neopixels will be refreshed.  (This is so things such as automatic blinking and fading can happen automatically).  That boils down to 9 bytes of data/neopixel * 64 max neopixels = 576 bytes of data/20 ms.  (It will support more Neopixels, but 64 seems like a good starting point).  The standard functions created by the tool are blocking calls.  Blocking calls wait in a busy loop if they can’t put all the data on the Tx buffer.  If a solenoid needed to be fired at that time, it couldn’t happen because the processor would be busy updating the Neopixels.  Not very efficient at all.  Instead, the code will watch for the FIFO empty interrupt and when that happens toss another 8 bytes of info onto the FIFO, and then go back to normal processing.  This means that the processing of the Neopixel data is distributed over the 20 ms time period, and latencies for other processing will be reduced.
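To show where the 9 bytes per Neopixel come from, here is a sketch of the 3 SPI bits per Neopixel bit expansion.  The bit patterns are an assumption (the commonly used 110 for a one and 100 for a zero), and the function names are made up for the example.  The green, red, blue byte order is the order the WS2812 expects.

```c
#include <stdint.h>

/* Expand one color byte into 3 SPI bytes, most significant bit first. */
void encode_color_byte(uint8_t value, uint8_t out[3])
{
   uint32_t bits = 0;
   int i;

   for (i = 7; i >= 0; i--)
   {
      bits <<= 3;
      /* 110 = long high pulse (one), 100 = short high pulse (zero) */
      bits |= ((value >> i) & 1u) ? 0x6u : 0x4u;
   }
   out[0] = (uint8_t)(bits >> 16);
   out[1] = (uint8_t)(bits >> 8);
   out[2] = (uint8_t)bits;
}

/* Build the 9 byte SPI stream for one pixel. */
void encode_pixel(uint8_t red, uint8_t green, uint8_t blue, uint8_t out[9])
{
   encode_color_byte(green, &out[0]);
   encode_color_byte(red, &out[3]);
   encode_color_byte(blue, &out[6]);
}
```

Doing this expansion ahead of time into a RAM buffer means the FIFO interrupt only has to copy bytes, which keeps the time spent in the interrupt short.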

So why is the name of this post “Get Off My Back.”  I’m hoping that either this weekend or next weekend I will have the code up and running, and be able to hand off a demo unit to allow Dave to play with it and see its capabilities.  He currently makes replacement backbox lighting mods for Stern machines and does a really nice job.  This should give him a very low cost solution so he will be able to make his backbox mods that much more exciting.

8/31/2015, PSOC 4200, The Good, The Bad, and The Ugly

Nice name for the entry seeing as though it is already ten days after I started writing this post.   This entry is going to be mostly on bringing up an embedded processor from scratch.

So I started working with the PSoC 4200 over the last couple of weeks.  After going down a good number of rabbit holes, and not being able to figure out if certain tools would work together, I decided to simply suck it up and use the Cypress Creator software to start.  The bonus of that is that within a couple of hours I was able to download the example project, and burn it onto the board and run it.  I then made a small change, recompiled and threw that down onto the board to make certain that I was actually programming the board successfully.

When I bring up a processor from scratch, I like to follow a pretty rigorous path.  It basically moves from the absolutely easiest stuff to the more difficult stuff.  Here is the order that I tend to tackle the projects, or embedded programming 101:

  1. Blink an LED.  Almost every board has an LED on it, and if not throw down a resistor and an LED on one of the output pins.  The first incarnation uses a loop and a counter to blink the LED on and off.  I then modify the counter in a loop to make it blink either faster or slower to prove to myself that I am altering the code successfully.  (At this point, I hook up a debugger and make sure that I can view and step through the code to make sure that I have the debugger set up properly.  Currently, I don’t have a debugger, but I’m hoping to grab one in the next few weeks).
  2. Blink the LED using a timer to change between the LED on and the LED off.  This proves that I understand configuring the timers, and understand the clocks within the chip.  It is very easy to accidentally miss a clock divider, or be off by a little bit.  This step also requires understanding clock routing within the chip.
  3. Use the timer to cause an interrupt and change the LED on/off state in an interrupt.  Requires understanding of interrupts, interrupt vector table, and how to clear the interrupt sources.
  4. Transmit words continuously on the UART interface.  Make sure that I have the baud rate set up properly for the UART, and can see the data coming back on a PC.
  5. Echo received characters on the UART interface, back to the transmit interface.  When I type an ‘x’ on the keyboard, I should see that echoed back.

After finishing those five simple steps I’m usually familiar enough with the processor that everything else is simply reading documentation and digging through registers.  Doing the above steps forces the programmer to read and understand the documentation since every company documents their chip in very different ways.  Some companies have a single 2000 page programming reference guide, while other companies break each hardware subsection into a different document.

So one of the absurd things with the Cypress Creator IDE is that it tries to force the programmer into using their canned components as opposed to actually programming the processor.  If a UART is needed, drag and drop it into the fake processor schematic and fill in a couple of fields.  No need to understand the registers that actually are underlying the component.  Same thing with I/O pins, SPI buses, etc.  The down side to that is the code becomes very large, because it requires all of these generic components where most of the functionality is not being used.

Here is the actual C code that blinks the LED on the board:

typedef volatile unsigned long R32;
typedef unsigned long U32;

#define GPIO_PRT1_DR        0x40040100
#define GPIO_PRT1_PS        0x40040104
#define GPIO_PRT1_PC        0x40040108
#define GPIO_PRT1_INTR_CFG  0x4004010c
#define GPIO_PRT1_INTR      0x40040110
#define GPIO_PRT1_PC2       0x40040114
#define HSIOM_PORT_SEL1     0x40010004

int main()
{
   /* Initialization code */
   U32 count = 0;

   *((R32 *)GPIO_PRT1_DR) = 0xff;
   *((R32 *)GPIO_PRT1_PC) = 0x00180000;
   *((R32 *)GPIO_PRT1_INTR_CFG) = 0;
   *((R32 *)GPIO_PRT1_PC2) = 0;

   /* Send the GPIO bit to the hardware pin */
   *((R32 *)HSIOM_PORT_SEL1) &= ~0x0f000000;

   for (;;)
   {
      /* Place your application code here. */
      count++;

      /* Halfway through the period, write 0 to the port data register */
      if (count == 0x10000)
      {
         *((R32 *)GPIO_PRT1_DR) = 0x00;
      }

      /* At the end of the period, write 0xff and restart the count */
      if (count >= 0x20000)
      {
         *((R32 *)GPIO_PRT1_DR) = 0xff;
         count = 0;
      }
   }
}

Looking at the above code, it is a grand total of five initialization statements, and then a simple loop with a counter changing the LED bit on and off.  Compiling the code takes maybe two or three seconds.  It produces 100 bytes of object code.  Using the Creator tool and a PWM to blink the LED on and off, it takes twenty or thirty seconds to compile, and generates a couple K of functions that I may or may not use.  The long compile time is caused by auto generating code, routing and configuring clocks inside the chip, and running the FPGA style generator.  I’m not really using any of those resources currently, so it is simply a waste of my time.

The other issue is that Cypress has allowed their documentation to take a back seat to their code generation tool.  The documentation is not only poor, but it is incomplete.  I tried to find the addresses of certain registers, and they are not located in any of the documents.  I had to dig through their generated code to figure out the addresses of the registers.  I would say that their documentation is very poor, even below Microchip if that is possible.  Right now Freescale and ST Micro have very good and very complete documentation.  Microchip and Cypress don’t compare.

They provide a bootloader.  That is fabulous especially if it is written well.  The only problem is that it seems to have been written by a first or second year co-op student.  The bootloader size should be as small as possible.  The Microchip bootloaders I wrote were either 768 or 1024 bytes depending on the processor.  The Freescale bootloader was 1024 bytes.  (It was actually about 530 bytes, but because of the flash protection scheme, code could only be protected in 512 byte blocks, so I had to move to two blocks).  Their bootloader component is 6400 bytes, or 1/5 of the 32K of code in the processor.  What the heck are they doing in there?  Maybe they are running the SETI program with the unused cycles.

A second issue with the bootloader is that it requires RAM for the interrupt vector table.  Admittedly, that is the easiest way to write a bootloader, but it means that 196 bytes of RAM are sucked up by that table.  A better, but more complex method is to create a jump table in the low flash, and move the exception vector table into the application code and not rely on RAM.  While it will incur a couple extra clock cycles of delay during an interrupt, the RAM is preserved for use by the application code.  That would have been a show stopper with the old processor that only had 8K of Flash and 512 bytes of RAM.  Since this processor has 32K of Flash, and 4K of RAM, it is less of an issue.

Right now I don’t have a debugger.  I’m gonna grab a Pioneer board (costs about $25) which includes a debugger and a PSOC 4200 processor.  Having a debugger will really accelerate the speed of developing the code.

8/17/2015, Why have I never stopped at Flippers in Grandy, NC

Disclaimer:  No technical content on this blog entry, so if that is what you seek, just skip over this one.

My family has been visiting the Outer Banks in NC since 1968.  To get there you drive down to Norfolk, VA, and then take little two lane roads until you reach the Outer Banks.  This year, we split the drive down from Boston into two days and spent the night in Norfolk.  Getting up in the morning, we got a jump on the traffic and I got a chance to stop at Flippers.

Wow, the arcade is fantastic!  They have both old and new machines.  This is where the NC state finals are held, and it is very easy to see why.  They probably have 50 or 60 pinball machines, and all of them are in exceptional condition.  The machines play like they should play.  They haven’t clearcoated all the playfields so that all the shots are easy.  The playfields are the original playfields, and they play how they did 30 years ago.  Spectacular.

Here were some of the highlights:

  • Humpty Dumpty machine that you can play for a nickel.  First flipper machine ever, and it just has to be given a try.  It almost seems like a nudge machine instead of a flipper machine.  It is very difficult to move the ball up the playfield and a flipper can only move it up one level.
  • Big Bang Bar.  Set at 50 cents per play.  Actually all the machines that I played were 50 cents a play.  I now understand why people call it a single shot game, but it was really fun to play it.  The callouts are not for the kids.
  • Cactus Canyon revisited.  Very fun game.  Seemed to have lots of different shots and a good amount of depth.
  • Monster Bash.  First Monster Bash that I have played that didn’t have major issues.  Tons of fun to shoot Frankenstein and electrify him.
  • Medieval Madness.  Always a fun play.
  • Tons of new Stern machines including Kiss, Walking Dead, Star Trek, etc.  But why play those, when you have so many classic titles to choose?  (Well one reason might be because they are only 50 cents a play)

I’m hoping to break away from the family and do one more trip up there to play for a couple more hours.

If you are going to the Outer Banks in NC, you will be driving past Flippers, and you should really stop.  Stop for a couple of minutes to play some really rare machines, or spend more time, and really get a chance to experience the best arcade that I’ve ever seen.

8/11/2015, Second Gen Boards arrived

Good gosh!  I am consistently amazed at how quickly boards can be produced in China and get shipped to the US even with slow Hong Kong post shipping.  The boards were ordered on 7/28/2015.  It is now 8/11/2015, so the boards were received in 2 weeks even choosing the slowest shipping option available.  (They did up their shipping rate by $3 which annoys me, but it continues to be a really good value for the money.)

It is funny how when laying out boards, it is always zoomed into the board, so it seems like they are rather large.  Then when they are received, wow, they are really itty bitty boards.  Each wing board is approximately 1″ x 2″.  When eight of them are thrown down on a single PCB (i.e. the mashup which is in the repository), it is still only 4″ x 4″ of PCB (or < 10 cm x 10 cm).

Last night I cut out all the boards using the tile saw.  Next time, I might change the mashup layout a little bit to minimize the number of cuts that I have to make, but all in all, it worked out really well.  It took about an hour to cut out the 80 boards.  I soldered up two of each of the cards and I’m shipping them out to somebody who is interested in working on the embedded code for the next generation boards.

After looking at the PSoC 4200 in a little bit more depth, I’m not really that jazzed about their initially programmed bootloader.  It uses too many resources and does not seem “hardened” enough.  One of the lower priorities will be to rewrite that piece of code.

Here’s a quick picture of all the boards cut out:


Here are the boards that are populated.  Note:  When I tried to do the mashup, I lost the VLED voltage plane, so I had to add a wire.  I fixed that issue the day after I ordered the boards, so the Gerbers in the repository are correct.


Here is a mockup of the PSoC 4200 with the wing boards.  This guy supports 4 solenoids, 8 inputs, and 16 incandescent bulbs (Note:  Inputs don’t require a card because they are attached to the inputs of the processor.  You can see the connector soldered straight to the board):

Sol4, Inp8, Incand 16

Here is a mockup of the PSoC 4200 with the wing boards.  This guy supports 8 solenoids, 8 inputs, and 8 incandescent bulbs:

Sol 8, Inp 8, Incand 8

Last mockup with the wing boards.  This guy supports 4 solenoids, 16 inputs, and a SPI interface that can be used to talk to WS2812 chips (Neopixels):

Sol 4, Inp 16, SPI

I did some quick power measurements on SS3.  Power for SS3 in attract mode is 142W.  Power holding up both flippers is 202W.  Base power for the four power supplies creating 48V is about 50 to 60W.  Taxi in attract mode is 110W.  Holding both flippers up is 140W.  I’m going to be removing 4 of the PC power supplies and replacing them with a single 36V supply.  That should reduce the power by a good amount.