Monday, April 29, 2013

Data logging...

Logging data from "Things": I've run computer-operated planes, rockets, cars, and helicopters, and all of them share a similar problem. How do you log data?  I like to log ALL raw sensor readings unmodified.  This ends up as a stream of multi-rate data, i.e. I see raw IMU readings at 200Hz, compass readings at 20Hz, GPS at 20Hz, and various things like mode changes and responses at rates varying from 200Hz to once at power up.

Some things I've learned:


  • I don't trust a file system. If your thing crashes, the data you are really interested in is the last few bits of data before impact. I've lost too much data to trust a file system to be robust when the recording media is de-powered and spattered all over the landscape.
  • You can't squeeze everything you want to record in a telemetry stream.
  • I keep reinventing this same wheel.
  • In the course of a project the format of what you are recording evolves. Keeping the code for the data display and the data recorder in sync is difficult and often I end up unable to read earlier logs.
My current approach is as follows:
 Record everything to flash directly. It makes no difference if it's an SD card or a dedicated flash chip; I use the same format.  I fully erase the flash (all the sectors on SD), then on power up I scan the flash page by page or sector by sector looking for the first blank page/sector.  I then start recording new data in that sector....
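
A minimal sketch of that power-up scan, assuming the whole device can be viewed as one byte array and that an erased sector reads back as all 0xFF. The names `kSectorSize`, `IsBlankSector` and `FindFirstBlankSector` are hypothetical; a real logger would read one sector at a time through its SD or flash driver.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed sector size; SD cards use 512-byte sectors, flash chips vary.
constexpr std::size_t kSectorSize = 512;

// A sector counts as blank if every byte still holds the erased value 0xFF.
// (Assumption: some cards read back 0x00 after an erase, so check your media.)
bool IsBlankSector(const std::uint8_t* sector)
{
    return std::all_of(sector, sector + kSectorSize,
                       [](std::uint8_t b) { return b == 0xFF; });
}

// Returns the index of the first blank sector, or the sector count if the
// device is full. 'flash' is an in-memory stand-in for the real device.
std::size_t FindFirstBlankSector(const std::vector<std::uint8_t>& flash)
{
    const std::size_t count = flash.size() / kSectorSize;
    for (std::size_t i = 0; i < count; ++i)
    {
        if (IsBlankSector(&flash[i * kSectorSize]))
        {
            return i;
        }
    }
    return count;
}
```

Because sectors are filled in order, the first blank one is also where the previous run stopped, so nothing else needs to be stored to find the end of the old data.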

My data recording has a big circular buffer that records the data in stream format....
All the recorded data is in blocks with the format:
START ID ... (Next START)


Where START is a specific byte and ID is a numeric value representing the kind of data...
If the value START appears in the data it is escaped... The packet ends with the next START....
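
The post does not say which byte values or which escape scheme are used, so the following is only a sketch of one common way to do it (HDLC-style byte stuffing). The values `kStart`, `kEscape` and `kXor`, and the function names, are assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kStart = 0x7E;   // assumed START value
constexpr std::uint8_t kEscape = 0x7D;  // assumed escape marker
constexpr std::uint8_t kXor = 0x20;     // assumed mask applied to escaped bytes

// Writes one byte, escaping it if it collides with START or the escape marker.
void PutEscaped(std::vector<std::uint8_t>& out, std::uint8_t b)
{
    if (b == kStart || b == kEscape)
    {
        out.push_back(kEscape);
        out.push_back(static_cast<std::uint8_t>(b ^ kXor));
    }
    else
    {
        out.push_back(b);
    }
}

// Appends START, then the escaped ID, then the escaped payload.
void AppendRecord(std::vector<std::uint8_t>& out, std::uint8_t id,
                  const std::uint8_t* data, std::size_t len)
{
    out.push_back(kStart);
    PutEscaped(out, id);
    for (std::size_t i = 0; i < len; ++i)
    {
        PutEscaped(out, data[i]);
    }
}

// Splits a stream back into records (ID byte followed by payload).
// The last record has no following START, so it may be incomplete.
std::vector<std::vector<std::uint8_t>> SplitRecords(
    const std::vector<std::uint8_t>& stream)
{
    std::vector<std::vector<std::uint8_t>> records;
    bool escaped = false;
    for (std::uint8_t b : stream)
    {
        if (b == kStart)
        {
            records.emplace_back();  // a raw START always begins a new record
            escaped = false;
        }
        else if (records.empty())
        {
            continue;  // skip junk before the first START
        }
        else if (escaped)
        {
            records.back().push_back(static_cast<std::uint8_t>(b ^ kXor));
            escaped = false;
        }
        else if (b == kEscape)
        {
            escaped = true;
        }
        else
        {
            records.back().push_back(b);
        }
    }
    return records;
}
```

Since a raw START can never appear inside a record, a reader can resynchronize after a damaged sector simply by skipping ahead to the next START.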

I then have a background process that takes this circular buffer and writes it to flash sectors whenever there is a full sector's worth of data.
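
A rough sketch of that split between the logging side and the background writer. The sizes, the `LogRing` type and the `FakeFlash` stand-in are all hypothetical, and the sketch is single-threaded; real firmware would need to protect `head` and `tail` if they are shared between a task and an interrupt.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kSectorBytes = 512;             // assumed sector size
constexpr std::size_t kRingBytes = 4 * kSectorBytes;  // assumed buffer size (power of two)

struct LogRing
{
    std::uint8_t data[kRingBytes];
    std::size_t head = 0;  // total bytes ever written by the logger
    std::size_t tail = 0;  // total bytes ever flushed to flash
};

// Called by the logging code; returns false if the buffer is full.
bool RingPut(LogRing& r, std::uint8_t b)
{
    if (r.head - r.tail == kRingBytes)
    {
        return false;
    }
    r.data[r.head % kRingBytes] = b;
    ++r.head;
    return true;
}

// Stand-in for the flash device: each element is one written sector.
using FakeFlash = std::vector<std::vector<std::uint8_t>>;

// Called from the background process: writes out only complete sectors
// and leaves any partial sector in the ring for next time.
std::size_t FlushFullSectors(LogRing& r, FakeFlash& flash)
{
    std::size_t written = 0;
    while (r.head - r.tail >= kSectorBytes)
    {
        std::vector<std::uint8_t> sector(kSectorBytes);
        for (std::size_t i = 0; i < kSectorBytes; ++i)
        {
            sector[i] = r.data[(r.tail + i) % kRingBytes];
        }
        flash.push_back(sector);
        r.tail += kSectorBytes;
        ++written;
    }
    return written;
}
```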


Now I might record something like my GPS reading...


typedef struct
{
 double lattitude; //Radians
 double longitude;//Radians
 double ht;  //m
 unsigned long msec;
 unsigned short week;
 double ECEF_X;//m
 double ECEF_Y;//m
 double ECEF_Z;//m
 float hspeed;//m/sec
 float head; //radians
 float vv; //m/sec
 unsigned char flag1;
 unsigned char flag2;
 unsigned char nsat;
 WORD seq;
}CombinedTrimGps;


So that would be stored as

START GPS_ID the contents of the structure.....

Now in the past, if I changed the record, say by adding a list of SNR readings, I'd have to redo the code that decodes the GPS data from the data logs....

In an ideal world C++ would have introspection and it would be able to put the format of the structure into the data log the first time it's used.....  then when the data reader sees the data it immediately knows what format the data is in....

Since C++ does not have introspection... I wrote some code that comes very close...


void LogRecord(CombinedTrimGps & item)
{
  LogStart(LOG_TYPE_BD960 ,"GPS");
  LogElement(lattitude,"lattitude " );
  LogElement(longitude,"longitude " );
  LogElement(ht       ,"ht        " );
  LogElement(msec     ,"msec      " );
  LogElement(week     ,"week      " );
  LogElement(ECEF_X   ,"ECEF_X    " );
  LogElement(ECEF_Y   ,"ECEF_Y    " );
  LogElement(ECEF_Z   ,"ECEF_Z    " );
  LogElement(hspeed   ,"hspeed    " );
  LogElement(head     ,"head      " );
  LogElement(vv       ,"vv        " );
  LogElement(flag1    ,"flag1     " );
  LogElement(flag2    ,"flag2     " );
  LogElement(nsat     ,"nsat      " );
  LogEnd();
}

Using Macros this expands into...



void LogRecord(CombinedTrimGps & item)
{
  static bool bIntroed;
  BYTE tid=LOG_TYPE_BD960;
  if(!bIntroed)
  {
    bIntroed=true;
    StartItemIntro(tid,sizeof(item),"GPS");
    ShowElement(tid,&item,&item.lattitude,item.lattitude,"lattitude " ) ;
    .
    .//All the elements
    .
  }
  LogRawRecord(tid,(const unsigned char *)&item,sizeof(item));
}

I have a different ShowElement  overloaded function for each type I might want to log....
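
The macros and the ShowElement overloads themselves are not shown in the post, so here is one way they could look, reconstructed from the expansion above. Treat it as a sketch: the `FieldInfo` layout, the one-letter type codes, the in-memory `g_formats`/`g_records` stand-ins and the `Sample` structure are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// In-memory stand-ins for the log (a real logger would write to flash).
struct FieldInfo
{
    std::uint8_t tid;
    std::string name;
    char type;           // assumed one-letter type code
    std::size_t offset;  // byte offset inside the structure
    std::size_t size;    // size in bytes
};
std::vector<FieldInfo> g_formats;
std::vector<std::vector<unsigned char>> g_records;

void StartItemIntro(std::uint8_t tid, std::size_t size, const char* name)
{
    g_formats.push_back({tid, name, 'S', 0, size});  // 'S' marks the structure itself
}

void Describe(std::uint8_t tid, const void* base, const void* elem,
              char type, std::size_t size, const char* name)
{
    const auto offset = static_cast<std::size_t>(
        static_cast<const char*>(elem) - static_cast<const char*>(base));
    g_formats.push_back({tid, name, type, offset, size});
}

// One overload per loggable type; the value argument only selects the overload.
void ShowElement(std::uint8_t t, const void* b, const void* e, double, const char* n)
{ Describe(t, b, e, 'd', sizeof(double), n); }
void ShowElement(std::uint8_t t, const void* b, const void* e, float, const char* n)
{ Describe(t, b, e, 'f', sizeof(float), n); }
void ShowElement(std::uint8_t t, const void* b, const void* e, unsigned char, const char* n)
{ Describe(t, b, e, 'B', sizeof(unsigned char), n); }
void ShowElement(std::uint8_t t, const void* b, const void* e, unsigned short, const char* n)
{ Describe(t, b, e, 'H', sizeof(unsigned short), n); }
void ShowElement(std::uint8_t t, const void* b, const void* e, unsigned long, const char* n)
{ Describe(t, b, e, 'L', sizeof(unsigned long), n); }

void LogRawRecord(std::uint8_t tid, const unsigned char* data, std::size_t len)
{
    std::vector<unsigned char> rec(1, tid);
    rec.insert(rec.end(), data, data + len);
    g_records.push_back(rec);
}

// The macros rely on the function parameter being named 'item'.
#define LogStart(id, name)                        \
    static bool bIntroed;                         \
    const std::uint8_t tid = (id);                \
    if (!bIntroed) {                              \
        bIntroed = true;                          \
        StartItemIntro(tid, sizeof(item), name);
#define LogElement(field, name) \
    ShowElement(tid, &item, &item.field, item.field, name);
#define LogEnd() \
    } LogRawRecord(tid, reinterpret_cast<const unsigned char*>(&item), sizeof(item));

// Hypothetical two-field record used to exercise the macros.
struct Sample { double speed; unsigned char nsat; };

void LogRecord(Sample& item)
{
    LogStart(42, "Sample");
    LogElement(speed, "speed");
    LogElement(nsat, "nsat");
    LogEnd();
}
```

Logging a `Sample` twice produces three format entries (the structure plus two fields) the first time and only the raw bytes after that. Recording the offset and size of every field lets a reader cope with structure padding and with fields that are deliberately left out of the description.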


Thus the first time this data type is logged it puts the format into the data log record....
This is something I've wanted since I first did the rocket stuff 6 years ago.

I've now got a self describing log format that is complete and efficient....
I can also use the exact same format for Telemetry....

On my rockets I had an optional parameter that went to the logging function that selected if the data went to the LOG, to telemetry or to both locations....
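
A small sketch of what such an optional destination parameter could look like. The `LogDest` enum, the default value and the two in-memory streams are assumptions for illustration, not the rocket code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bit flags so that "both" is simply the two destinations OR'ed together.
enum LogDest : unsigned
{
    kDestLog = 1u,
    kDestTelemetry = 2u,
    kDestBoth = 3u
};

std::vector<std::uint8_t> g_flashStream;      // stand-in for the flash log
std::vector<std::uint8_t> g_telemetryStream;  // stand-in for the radio link

void AppendTo(std::vector<std::uint8_t>& out, std::uint8_t tid,
              const unsigned char* data, std::size_t len)
{
    out.push_back(tid);
    out.insert(out.end(), data, data + len);
}

// The destination defaults to the log, so existing callers stay unchanged.
void LogRawRecord(std::uint8_t tid, const unsigned char* data, std::size_t len,
                  LogDest dest = kDestLog)
{
    if (dest & kDestLog)
    {
        AppendTo(g_flashStream, tid, data, len);
    }
    if (dest & kDestTelemetry)
    {
        AppendTo(g_telemetryStream, tid, data, len);
    }
}
```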

Last night I got the whole chain to work and the picture below is derived from this chain of events...
i.e. I drove the AVC car around my driveway -> recorded a log -> processed the log to generate records -> put them in Excel -> then made a graph of locations....

Now I have to get the data replay tool I wrote a few years ago to read this format and my data logging tools will be complete!







Friday, April 19, 2013

Finding a Firmware bug...

I spent the last few days hunting down a bug.
It was interesting enough that I thought I'd write a short blog post....

First some background...
In all of the NetBurner products we try very hard to make sure address space 0 is unmapped,
ie any attempt to reference a NULL pointer should cause an error.
The particular processor I was bug hunting on boots at address zero so in most normal scenarios address 0 is mapped to some kind of memory.

In our case this is not so...after the boot monitor boots, it turns off the hardware memory mapping to address 0...and relocates all valid memory up higher in the memory map...
  (We are not using an MMU)

The bug we were hunting was an interrupt latency bug. 99.99% of the time the part has an interrupt latency of 0.9 usec or so...  every once in a while it would have a longer latency, randomly varying from 1 to 520usec.   520usec is WAAAAAY too long.....

So to instrument this I set up one of the on-chip timers to make a pulse and reset the counter at a fixed rate...
I hooked this timer output up to the non-maskable interrupt and in the interrupt the first thing I did was read the timer counter value... this is an all on-chip direct measurement of the interrupt latency...

And soon found that it varied randomly from 0.9 to 520usec.
Step 1 complete we can replicate the problem.....
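
The bookkeeping for this kind of measurement is small. In the sketch below the hardware access is left out: `RecordLatency` is assumed to be called from the interrupt with the counter value it just read, and the 125 MHz timer clock in the usage note is a made-up figure, not the real part's clock.

```cpp
#include <cstdint>

// Running statistics over the counter values read at interrupt entry.
struct LatencyStats
{
    std::uint32_t minTicks = UINT32_MAX;
    std::uint32_t maxTicks = 0;
    std::uint32_t samples = 0;
};

// The timer resets when it fires the interrupt, so the counter value read
// first thing in the handler is the latency in timer ticks.
void RecordLatency(LatencyStats& s, std::uint32_t ticksAtEntry)
{
    if (ticksAtEntry < s.minTicks) s.minTicks = ticksAtEntry;
    if (ticksAtEntry > s.maxTicks) s.maxTicks = ticksAtEntry;
    ++s.samples;
}

// Converts ticks to microseconds for a timer clocked at timerHz.
double TicksToUsec(std::uint32_t ticks, double timerHz)
{
    return ticks * 1.0e6 / timerHz;
}
```

With a hypothetical 125 MHz timer clock, a worst case of 65000 ticks would correspond to 520 usec.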

After this I tried a whole bunch of things to hunt this down: looked at code, moved stacks and vector tables to/from different classes of memory... nothing changed the result....

If I make the whole code set small enough so that it all fits in the instruction cache it goes away....

As a random idea I changed the bus timeout monitor from  very long to something shorter...
and the latency improved... this was the clue I needed to hunt down the bug....

Here is the bug:

We use a lot of C code in the system of the form:


         if ( pDHCPTick )
         {
             pDHCPTick();
         }

This code checks a function pointer to see if it's null and calls the function if it isn't...
This chunk of code is inside one of the system tasks and is used to service the DHCP client if it's active...


This compiles to  assembly code something like:

moveal  pDHCPTick,%a0
testl %a0
beqw  skip
jsr @%a0
skip:

So if pDHCPTick is null the jsr @%a0 will never be executed....
But the processor I'm having the issue with has aggressive pipelining....
So it tries to fetch the instruction at @%a0 even if it never executes the jsr.

This causes an instruction fetch  of  address 0....
If address 0 is not in the instruction cache then this forces a cache miss and a cache line read from physical address 0.....

Physical address 0 is un-mapped so the bus timeout monitor goes off terminating the transaction 520usec later...
If the interrupt has the misfortune of trying to go off while we are hung in the bus timeout we get long latency....

So the problem was solved by setting up the system so there is a valid bus ack at address 0....
Makes it so we don't catch NULL pointer accesses... but fixes the interrupt latency...

We are working on a better solution... but I thought the problem crossed enough different hardware/software domains that it was interesting.

Wednesday, April 03, 2013

Teaching a computer to fly...


For this year's SparkFun AVC I'm planning to bring both a car and a plane. The car is primarily last year's car upgraded with a lower center of gravity and a much, much better GPS.    The plane is a different story...
As part of the Lunar Lander Challenge I built an autonomous helicopter to test out my code.  Right now I have a vectored thrust rocket I'm flying as well. All in all, in these two efforts the actuators and the response of the vehicle are fairly well coupled and tight.   A rocket or helicopter is a brute force thing that beats the environment into submission. An airplane is a gentle, elegant creature. I've been a pilot since I was a teenager, and when I look at most of the open or openly discussed UAV autopilots out in the world it strikes me that a lot of the people working on them don't understand this distinction.

I have a copy of X-Plane 10, which I'm using as a simulator for the autopilot development.
I've set this up as a hardware-in-the-loop simulator and I've been slowly teaching the NetBurner to fly,
much like my Dad taught me to fly a long time ago.

  • Manage your airspeed, 
  • Coordinate your turns, 
  • Hold a heading, 
  • Hold an altitude,  
  • Make smooth level turns...
  • Turn to a heading...


Now learn to do all of the above while climbing, descending, accelerating, decelerating. You have a precision tool with the grace of a light saber, feel the grace.  If the engine dies it should still be in control when it glides gently to earth.....


 An airplane can be flown with a single axis rate gyro; anyone who has passed a partial panel IFR checkride understands this.   I have all the rate gyro integrating quaternion attitude estimators that everyone else has.... alas, most of this is not needed. You can do everything you need to do to control the airplane with airspeed, rate of turn, magnetic compass and altitude.

The interactions can be subtle... does the throttle control speed? does it control altitude? what variable does it control? How about energy.....  I found this paper to be especially good on this point.

The basic bottom level  control laws are simple....

Inputs:
Desired Airspeed
Desired Heading
Desired Altitude (or rate of climb)

Sensors:
Pitot AirSpeed (IAS)
Magnetic heading
Rate of turn.
Altitude (either baro or GPS driven)


The controls are simple...
The elevator is controlled with Airspeed  and TargetAirspeed
The rudder does nothing more than keep the "Ball" centered in flight or the aircraft on the runway on takeoff.
The Aileron is controlled by rate of turn, magnetic heading and target heading
The throttle is controlled with the current energy state, i.e. airspeed^2 and altitude vs. their target values.
(I've also found it useful to have a bit of feed forward from turn rate to throttle to add a little power in the turns)
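
To make the structure of these laws concrete, here is a bare proportional-only sketch. Everything numeric in it is an assumption: the gains, the sign conventions, the 0.5 cruise throttle and the normalized control ranges are placeholders, not tuned values from the autopilot. The energy term uses specific energy, v^2/2 + g*h, so airspeed and altitude errors end up in the same units.

```cpp
#include <algorithm>
#include <cmath>

struct Targets { double airspeed; double heading; double altitude; };  // m/s, rad, m
struct Sensors { double airspeed; double heading; double turnRate; double altitude; };  // m/s, rad, rad/s, m
struct Controls { double elevator; double aileron; double throttle; };  // normalized

constexpr double kPi = 3.14159265358979323846;
constexpr double kG = 9.81;  // m/s^2

// Wraps an angle difference into [-pi, pi] so the plane turns the short way.
double WrapPi(double a) { return std::remainder(a, 2.0 * kPi); }

double Clamp(double v, double lo, double hi) { return std::max(lo, std::min(v, hi)); }

Controls ComputeControls(const Targets& t, const Sensors& s)
{
    // Hypothetical gains, for illustration only.
    const double kElev = 0.05, kHdg = 0.5, kRoll = 1.0;
    const double kEnergy = 0.002, kTurnFF = 0.2, kCruise = 0.5;

    Controls c;

    // Elevator holds airspeed: too fast -> nose up (positive, by assumption).
    c.elevator = Clamp(kElev * (s.airspeed - t.airspeed), -1.0, 1.0);

    // Aileron: heading error asks for a turn rate, the rate gyro closes the loop.
    const double wantedRate = kHdg * WrapPi(t.heading - s.heading);
    c.aileron = Clamp(kRoll * (wantedRate - s.turnRate), -1.0, 1.0);

    // Throttle: error in specific energy, plus feed forward in turns.
    const double eNow = 0.5 * s.airspeed * s.airspeed + kG * s.altitude;
    const double eWant = 0.5 * t.airspeed * t.airspeed + kG * t.altitude;
    c.throttle = Clamp(kCruise + kEnergy * (eWant - eNow)
                           + kTurnFF * std::fabs(s.turnRate),
                       0.0, 1.0);
    return c;
}
```

The rudder is left out of the sketch because it only keeps the ball centered and does not depend on any of the targets.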

In any case I'm having fun writing this....