Josh Leverette's blog

Sensor Data

Since I'll probably be spending a lot of time processing data coming in over serial until the algorithms are correct, I'm experimenting with using my Raspberry Pi 2 as my development machine. I'm done with the digital logic analyzer for the time being, and it is the main tool I can't use from the Pi 2. The quad-core processor in the Pi 2 is immensely more powerful and responsive than the original Pi, so it's a perfectly pleasant experience; I'm typing this blog post on it right now. The main holdup is that the gcc-arm-none-eabi toolchain doesn't seem to be available pre-compiled for the Pi 2, so I'm running a script that will download and compile it. If this works, it would be ideal: the Pi 2 is more than fast enough to compile my small code base quickly, and I will be able to interact more directly with the streams of data coming over the serial connection.

It's also worth noting that I'm using a git submodule to connect lifereckoner to the LSM9DS0 driver, an approach I've heard mixed feelings about. So far it seems to be all right, but when I was setting things up on my Pi 2 I had to run git submodule init and then git submodule update before the LSM9DS0 folder was populated, if I recall correctly.

This is one sample output from my LSM9DS0 driver's demo, just for reference. The data looks about right, though it has a fair amount of noise in it that will need to be filtered out. When the demo ran, the sensor had been running long enough to fill up the FIFO queues (and begin overwriting values in a circular fashion) on the accelerometer and the gyro. The magnetometer does not have a FIFO queue, so there is only one Triple for it.

ax: -0.102234 Gs ay: 0.012085 Gs az: 0.925354 Gs

ax: -0.098816 Gs ay: 0.014893 Gs az: 1.010620 Gs

ax: -0.097473 Gs ay: 0.004150 Gs az: 0.923035 Gs

ax: -0.100342 Gs ay: 0.010864 Gs az: 1.000183 Gs

ax: -0.091553 Gs ay: 0.002380 Gs az: 0.934631 Gs

ax: -0.109985 Gs ay: 0.022156 Gs az: 1.003662 Gs

ax: -0.080505 Gs ay: 0.006470 Gs az: 0.927673 Gs

ax: -0.111572 Gs ay: 0.021240 Gs az: 0.999084 Gs

ax: -0.078735 Gs ay: 0.005737 Gs az: 0.941956 Gs

ax: -0.108765 Gs ay: 0.014709 Gs az: 0.973938 Gs

ax: -0.124756 Gs ay: 0.009827 Gs az: 0.935791 Gs

ax: -0.114563 Gs ay: 0.017639 Gs az: 0.925171 Gs

ax: -0.078064 Gs ay: 0.011475 Gs az: 0.970825 Gs

ax: -0.111450 Gs ay: 0.014160 Gs az: 0.952148 Gs

ax: -0.084106 Gs ay: 0.011108 Gs az: 0.989502 Gs

ax: -0.103943 Gs ay: 0.010498 Gs az: 0.943420 Gs

ax: -0.089172 Gs ay: 0.010254 Gs az: 0.993286 Gs

ax: -0.101807 Gs ay: 0.009827 Gs az: 0.930481 Gs

ax: -0.096069 Gs ay: 0.012573 Gs az: 0.998047 Gs

ax: -0.093018 Gs ay: 0.002319 Gs az: 0.922363 Gs

ax: -0.100586 Gs ay: 0.015076 Gs az: 0.999634 Gs

ax: -0.086609 Gs ay: 0.007751 Gs az: 0.931091 Gs

ax: -0.105347 Gs ay: 0.016663 Gs az: 1.007507 Gs

ax: -0.081604 Gs ay: -0.000122 Gs az: 0.932251 Gs

ax: -0.108093 Gs ay: 0.015442 Gs az: 0.994263 Gs

ax: -0.083801 Gs ay: 0.004333 Gs az: 0.939697 Gs

ax: -0.117920 Gs ay: 0.021729 Gs az: 0.982178 Gs

ax: -0.074707 Gs ay: 0.007202 Gs az: 0.949768 Gs

ax: -0.101929 Gs ay: 0.018738 Gs az: 0.972046 Gs

ax: -0.078613 Gs ay: 0.007690 Gs az: 0.970276 Gs

ax: -0.050598 Gs ay: -0.019531 Gs az: 0.510864 Gs

mx: -0.086426 Gauss my: 0.022949 Gauss mz: 0.163818 Gauss

gx: -1.495361 dps gy: -0.991821 dps gz: 0.213623 dps

gx: -1.327515 dps gy: -0.617981 dps gz: 0.228882 dps

gx: -0.839233 dps gy: -0.450134 dps gz: 0.717163 dps

gx: -1.678467 dps gy: -0.656128 dps gz: 0.099182 dps

gx: -1.899719 dps gy: -0.404358 dps gz: 0.038147 dps

gx: -1.518250 dps gy: -1.068115 dps gz: 0.221252 dps

gx: -2.616882 dps gy: -1.022339 dps gz: 0.274658 dps

gx: -2.037048 dps gy: -0.343323 dps gz: 0.244141 dps

gx: -1.647949 dps gy: -0.259399 dps gz: 0.419617 dps

gx: -1.182556 dps gy: -0.076294 dps gz: 0.305176 dps

gx: -1.052856 dps gy: -0.640869 dps gz: 0.740051 dps

gx: -1.174927 dps gy: -0.114441 dps gz: 0.885010 dps

gx: -0.617981 dps gy: -0.045776 dps gz: 0.495911 dps

gx: -0.534058 dps gy: 0.152588 dps gz: 0.221252 dps

gx: -1.022339 dps gy: -0.228882 dps gz: 0.282288 dps

gx: -0.495911 dps gy: -0.213623 dps gz: 1.022339 dps

gx: -0.640869 dps gy: -0.244141 dps gz: 0.686646 dps

gx: -0.640869 dps gy: -0.244141 dps gz: 0.686646 dps

gx: -0.640869 dps gy: -0.244141 dps gz: 0.686646 dps

gx: -0.640869 dps gy: -0.244141 dps gz: 0.686646 dps

gx: -0.640869 dps gy: -0.244141 dps gz: 0.686646 dps

gx: -0.640869 dps gy: -0.244141 dps gz: 0.686646 dps

gx: -0.640869 dps gy: -0.244141 dps gz: 0.686646 dps

gx: -1.098633 dps gy: -1.205444 dps gz: 0.160217 dps

gx: -1.098633 dps gy: -1.205444 dps gz: 0.160217 dps

gx: -1.098633 dps gy: -1.205444 dps gz: 0.160217 dps

gx: -1.098633 dps gy: -1.205444 dps gz: 0.160217 dps

gx: -1.098633 dps gy: -1.205444 dps gz: 0.160217 dps

gx: -1.098633 dps gy: -1.205444 dps gz: 0.160217 dps

gx: -1.098633 dps gy: -1.205444 dps gz: 0.160217 dps

gx: -1.098633 dps gy: -1.205444 dps gz: 0.160217 dps

LSM9DS0 Driver is complete.

Today, I fixed the issue with the gyro not returning any data (I had forgotten to enable the axes and was only enabling the sensor itself), implemented the dynamic scale-setting functions for the different sensors inside the LSM9DS0, and wrote some demo code to print the raw data coming back from the LSM9DS0 over the serial port.

Interesting thing: apparently my Mac mini is utterly incapable of handling baud rates higher than 9600bps. I was convinced the issue was my microcontroller for a solid 30 minutes, then I decided to hook up my DLA to the serial port RX/TX pins. It is rather ridiculous to me that I'm having to scope a standard serial port connection. The DLA showed perfectly correct data flowing at every baud rate, including my desired 115200bps setting. I spent a couple of hours tonight trying a thousand different things to get my Mac mini to read anything but 9600bps, and I had zero success. I tried everything from using screen, to downloading a Mac app called Serial, to writing my own serial reader using the Python pyserial module. Eventually, to be fair in the testing and to be certain that there wasn't some arbitrary limitation on the STM32 Nucleo, I booted up my Raspberry Pi 2 and connected the STM32 Nucleo to it. Using minicom (which I had also tried on the Mac), I was able to read 115200bps data streams perfectly.

If anyone has advice on how to get my Mac mini to behave better with serial data, I would appreciate it. As it stands, the problem doesn't seem to lie with my STM32 Nucleo, although I have (historically) been able to read higher-bps data from an Arduino on my Mac, so I don't know for sure what the root cause is.

As for the LSM9DS0, the driver is complete! I was printing data over the serial port that had been converted from the raw 16-bit values into the proper units, and the data looked just fine, approximately the values I had expected to see. I'm planning to do more detailed analysis tomorrow as I build out lifereckoner, which is what I've decided to call my dead reckoning system for the time being.
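For readers wondering what "converted into the proper units" might involve, here is a minimal sketch. It assumes the sensor delivers each axis as a low byte and a high byte in two's complement, and that the scaling is linear with 32768 counts mapping to the configured full-scale range. The function names are hypothetical and are not taken from the driver. The sample output at the top of this page appears consistent with this scaling (a raw value of -1675 at a full scale of 2 would print as -0.102234), but that is an inference rather than something the driver source confirms here.

```cpp
#include <cassert>
#include <cstdint>

// Combine the low and high output bytes into one signed 16-bit sample.
// Assumes two's-complement data, which is typical for this kind of sensor.
inline int16_t combineBytes(uint8_t low, uint8_t high) {
    uint16_t bits = static_cast<uint16_t>((high << 8) | low);
    return static_cast<int16_t>(bits);
}

// Scale a raw sample to physical units (g, gauss, or dps).
// Assumption: linear mapping where 32768 counts equal the full-scale range.
inline float toUnits(int16_t raw, float fullScale) {
    assert(fullScale > 0.0f);  // a zero or negative range makes no sense
    return static_cast<float>(raw) * fullScale / 32768.0f;
}
```

With these helpers, `toUnits(combineBytes(0x75, 0xF9), 2.0f)` evaluates to roughly -0.1022. Changing the sensor scale only changes the `fullScale` argument, which is one reason a dynamic scale-setting function can stay small.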

Code for lifereckoner is, of course, hosted on GitHub, as is the code for the LSM9DS0 driver.

Enabling Magnetometer and cleaning up the LSM9DS0 driver

Tonight, I enabled the magnetometer part of the driver, and it seems to be collecting data from the correct registers. A very interesting development was the realization that the magnetometer actually does not have a FIFO buffer. I suppose the designers decided that the 100Hz ODR wasn't fast enough to warrant a FIFO, which is reasonable, but it is interesting that it merely seems implicit in the LSM9DS0 manual. There is no explicit statement that I've seen that says "Only accelerometer and gyro have FIFO buffers and the magnetometer does not" or any such thing.

In the process of enabling the magnetometer driver, I realized that I had created a massive memory leak in the way I was handling data coming back from the readFromRegister function: the microcontroller was running out of memory and crashing. It only took me a minute to realize what the problem was. I decided to modify readFromRegister to simply return a single unsigned char rather than a pointer to an array of them, since collecting an array of values is a feature that wasn't being used anywhere, as explained next.

Earlier tonight, I extended the readAccel function to return a vector of Triples, and I disposed of the Option type altogether, since it was no longer needed once the LSM9DS0's behavior became clearer. As it stands now, the function reads the X, Y, and Z values of each sample before pushing that triple into the (pre-allocated/reserved) vector. If I decide to support chaining in the future, reading the values in this order should make it a drop-in feature, due to the way that I have the readFromRegister function configured. Until I can analyze the data being pulled from the LSM9DS0 and see that it actually makes sense, I don't want to introduce chaining, since the result, while it should be easily predictable, is harder to verify until then.

Interestingly, tonight I took advantage of the opportunity to use do...while loops in my code! Not once in nearly 10 years of programming can I recall a do...while loop actually being preferable to a regular while loop, so that was certainly interesting. As an example, this is what the do...while loop looked like in the readMagneto function:

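// Busy-wait until the magnetometer reports new data.
unsigned char status_reg;  // readFromRegister returns an unsigned char
bool hasData;
do {
    status_reg = readFromRegister(ACCMAG, REG_STATUS_M);
    // Data is ready once the masked status bits equal 0x0F.
    hasData = ((status_reg & NEW_DATA_M) == 0x0F);
} while (!hasData);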

Previously, I had been duplicating the logic both right before the loop and inside the loop. In certain busy-waiting cases, do...while enables closer adherence to the DRY principle.
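To make the duplication concrete, here is a small self-contained comparison. It is only a sketch: FakeStatusRegister is a hypothetical stand-in for the real status register, and the 0x0F mask value is assumed from the comparison in the snippet above. Both functions poll the same number of times; the difference is that the plain while version has to spell out the read-and-test step twice.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the status register: reports "no data" for a
// fixed number of polls, then reports 0x0F (assumed to mean "data ready").
struct FakeStatusRegister {
    int pollsUntilReady;
    uint8_t read() {
        assert(pollsUntilReady >= 0);
        if (pollsUntilReady == 0) return 0x0F;
        --pollsUntilReady;
        return 0x00;
    }
};

const uint8_t NEW_DATA_MASK = 0x0F;  // assumed mask value

// Plain while: the read-and-test logic appears before the loop and inside it.
inline int waitWithWhile(FakeStatusRegister& reg) {
    int polls = 1;
    uint8_t status = reg.read();
    bool hasData = (status & NEW_DATA_MASK) == NEW_DATA_MASK;
    while (!hasData) {
        status = reg.read();
        hasData = (status & NEW_DATA_MASK) == NEW_DATA_MASK;
        ++polls;
    }
    return polls;
}

// do...while: the body always runs once, so the logic appears only once.
inline int waitWithDoWhile(FakeStatusRegister& reg) {
    int polls = 0;
    uint8_t status;
    bool hasData;
    do {
        status = reg.read();
        hasData = (status & NEW_DATA_MASK) == NEW_DATA_MASK;
        ++polls;
    } while (!hasData);
    return polls;
}
```

Both functions return the number of polls they needed, which makes it easy to check that the two forms behave identically.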

I also have implemented the readGyro function, but it is currently stuck in the busy-wait loop because the Gyro FIFO buffer is claiming to be empty permanently. This indicates that I haven't fully activated the gyro module, which I'll have to investigate next.

After the gyro module is activated, I might implement the setScale function to dynamically alter the sensor scales, such as the maximum gravities that the accelerometer can sense, but that should be trivial. Beyond that, I am calling this LSM9DS0 driver feature complete, at least as far as a minimum-viable driver goes.

SPI success!

So, tonight I spent a few minutes rebuilding my circuit based on an understanding I reached while reading the MCU's documentation: there is actually only one SPI module on this particular chip. The LSM9DS0 relies on two separate SPI channels, one for the gyro and one for the accelerometer + magnetometer, so I had created two separate SPI objects and wired up two SPI connections as fully as possible.

It appears that there was resource contention that created the mysterious read issues I was having. My code is now running on a single SPI object, with only separate CS lines to control which chip is being spoken with, and now everything works perfectly!

Currently, reading a triple of data takes about 48us according to a measurement with my DLA, which means that I can theoretically read about 20,800 triples per second, where a triple is a complete set of X, Y, and Z data and each axis is a 16-bit value. All three sensors at maximum ODR generate 2660 triples per second (1600 for the accelerometer, 960 for the gyro, and 100 for the magnetometer). Reading them all takes about 128ms of every second, which leaves roughly 87% of the CPU free to do other work, if utilized properly.
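The arithmetic behind those figures is short enough to write down. The sketch below uses only the numbers quoted above (48us per triple and the three maximum ODRs); the function and constant names are made up for illustration.

```cpp
#include <cassert>

// Upper bound on triples per second if each read takes `microsPerTriple`.
inline double maxTriplesPerSecond(double microsPerTriple) {
    assert(microsPerTriple > 0.0);
    return 1000000.0 / microsPerTriple;
}

// Fraction of each second spent reading at the demanded triple rate.
inline double busyFraction(double triplesPerSecond, double microsPerTriple) {
    assert(microsPerTriple > 0.0);
    return triplesPerSecond * microsPerTriple / 1000000.0;
}

// Accelerometer + gyro + magnetometer at their maximum ODRs, per the post.
const double kTotalTriplesPerSecond = 1600.0 + 960.0 + 100.0;
```

At 48us per triple, `maxTriplesPerSecond` gives about 20,833, and `busyFraction(kTotalTriplesPerSecond, 48.0)` gives about 0.128, so a little under 13% of each second goes to reading and roughly 87% stays free.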

There is plenty of room for optimization through DMA, which reduces CPU involvement, or through chaining reads together, which reduces protocol overhead by a significant margin. The LSM9DS0 supports watermark interrupts. Each axis of each sensor supports a buffer up to (IIRC) 32 elements deep; you can set a watermark at, say, 24 to trigger an interrupt on the microcontroller. The microcontroller can then chain 24 reads together for each value. A normal read writes the target register's address, then reads the value. A chained read writes the target register's address, then reads however many values are desired. This reduces the bytes transferred on the SPI bus from 48 bytes for 24 reads down to 25 bytes, which is a significant savings.
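As a rough sketch of the difference between normal and chained reads, the helpers below build the command byte and count the bytes on the bus. The bit layout (bit 7 for read, bit 6 for address auto-increment, six address bits) is how I understand the LSM9DS0's SPI protocol, but treat it as an assumption and check it against the datasheet. All names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed SPI command-byte layout (verify against the datasheet):
// bit 7 = read, bit 6 = auto-increment the address, bits 5..0 = register.
const uint8_t READ_BIT = 0x80;
const uint8_t AUTO_INCREMENT_BIT = 0x40;

// Build the first byte of a read transaction.
inline uint8_t makeReadCommand(uint8_t reg, bool chained) {
    assert(reg <= 0x3F);  // only six address bits are available
    uint8_t chainBit = chained ? AUTO_INCREMENT_BIT : 0;
    return static_cast<uint8_t>(READ_BIT | chainBit | reg);
}

// Bytes clocked over the bus to fetch `count` one-byte values.
inline int bytesForSeparateReads(int count) {
    assert(count >= 0);
    return 2 * count;  // one address byte plus one value byte per read
}

inline int bytesForChainedRead(int count) {
    assert(count > 0);
    return 1 + count;  // one address byte, then the values back to back
}
```

For 24 values this gives 48 bytes unchained and 25 bytes chained, matching the figures above. The saving grows with the chain length, since the single address byte is amortized over more values.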

SPI deep dive

I've been reading deeply into the mbed code that controls SPI, particularly the platform-specific implementation underneath the HAL. It appears to incur a rather significant amount of overhead, overhead that was noted in the previous blog post. However, that overhead is fine for now -- what isn't fine is that I'm still not able to successfully read values from the LSM9DS0 over SPI.

I compared what the mbed code was doing to the STM32F334 Reference Manual, and it seemed to be functionally correct, yet I'm still not getting correct values. I'm always getting 0xFF, no matter what. Even if I hook the MISO line directly to ground or 5 volts, I always get 0xFF. The DLA is seeing the expected value returned from the LSM9DS0, but the STM32 is not seeing it at all.

I need to test the minimal example given for mbed SPI and see if it works, but I've already cut my usage of SPI down to the bare minimum to see where it stops working, and it seems that even my minimal case doesn't work.

The one bit of good news is I've done further testing, and the SPI hardware seems to work fine at 8MHz, so that's the frequency I'm running at now.

But, the mbed code is doing everything by the book. Waiting on the TX buffer to be free, writing the byte to be TXed, waiting for the RX buffer to receive a value, and then returning it. I have a few minor additional tests I need to run, but this is getting ridiculous. If writing to SPI is this easy, reading should work just as easily, and yet nothing I do makes any difference. The result is always 0xFF. Maybe my board is faulty, but nothing my multimeter has touched indicates a shorted solder joint or anything else abnormal.

It would be more work, but maybe I could use DMA (direct memory access) to do a semi-hardware bit-bang implementation of SPI through GPIO that would still be faster than I2C. Supposedly the GPIO pins can toggle every 2 clock cycles, which works out to 36MHz at a 72MHz clock, but this is a last-ditch option.

Or maybe I could go back to I2C and try doing sensor fusion at reduced data rates. The research I've done indicated that the more data I can gather, the less drift my sensor fusion will experience, if I recall correctly, which is why I would like to get the maximum ODRs (output data rates) for all three 3-axis sensors in the LSM9DS0. After that, I could also experiment with lower ODRs to compare and contrast.

Maybe I should post on the mbed or ST forums and see if anyone there knows what's going on, since Googling hasn't gotten me any closer to a solution.

SPI support

So, the good news is that SPI writing works perfectly, as far as I can tell with my digital logic analyzer (DLA). Unfortunately, the method for reading SPI does not seem to be working like mbed dictates that it should. The DLA shows the read commands working perfectly -- the LSM9DS0 returns exactly the expected values, whether I wrote them to the device or I'm reading from the WHO_AM_I register that has a preloaded and pre-known value on the LSM9DS0. On the mbed platform, there is no SPI->read command. SPI->write is supposed to return whatever value was seen on MISO. Unfortunately, mbed's SPI->write appears to be returning 0xFF always. This is undesirable behavior, suffice it to say.
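Since "reading" over SPI is really just writing and looking at what comes back, a small simulation may help. This is a sketch only: FakeSpiDevice is a hypothetical stand-in that mimics the shape of mbed's SPI write call (send a byte, get back the byte seen on MISO), the read bit and the WHO_AM_I address of 0x0F are assumptions to check against the datasheet, and CS handling is left out entirely.

```cpp
#include <cassert>
#include <cstdint>

const uint8_t WHO_AM_I_ADDR = 0x0F;  // assumed register address

// Hypothetical full-duplex SPI device: every byte written clocks one byte
// back. The reply to a command only appears on the following transfer.
class FakeSpiDevice {
public:
    explicit FakeSpiDevice(uint8_t whoAmI) : whoAmI_(whoAmI), pending_(0xFF) {}

    // Send one byte and return the byte received during the same clocks.
    int write(int value) {
        int response = pending_;
        // Assumed layout: bit 7 set means "read", low six bits are the address.
        bool readsWhoAmI = (value & 0x80) && ((value & 0x3F) == WHO_AM_I_ADDR);
        pending_ = readsWhoAmI ? whoAmI_ : 0xFF;
        return response;
    }

private:
    uint8_t whoAmI_;
    uint8_t pending_;
};

// Read one register: send the address, then send a dummy byte purely to
// generate the clocks that carry the reply back.
inline uint8_t readRegister(FakeSpiDevice& spi, uint8_t reg) {
    assert(reg <= 0x3F);
    spi.write(0x80 | reg);                         // reply here is ignored
    return static_cast<uint8_t>(spi.write(0x00));  // reply holds the value
}
```

If the second write always returned 0xFF regardless of the device, as described above, the fault would sit on the receive side of the bus or peripheral rather than in the command being sent.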

I'll have to investigate this behavior in more detail. Once it starts working, my driver will be running on SPI. There are some unexpected but entirely consistent performance hiccups that will drive my SPI performance down, but I've pored over the generated assembly looking for an explanation and I've come away with none. I'm seeing 1.72us delays between contiguous SPI->write commands, which works out to 124 clock cycles @ 72MHz. The assembly between each long-branch even with -O0 (zero optimization) is only around 8 to 10 instructions -- nowhere near enough to occupy that much time, from what I can see. Perhaps their write function has a lot of in-function overhead?

Another interesting thing that requires further investigation is that the SPI clock signal became unstable when driven at the full 10MHz that the LSM9DS0 supports. Some initial investigation I did a few days ago indicated to me that the STM32 Nucleo that I have only supports up to a 7MHz SPI clock, but I could be remembering incorrectly. I'll have to look and see that I actually am driving at the maximum clock supported, but the signal became predictable when I turned it down to 7MHz. Regardless, 7MHz should be fast enough to read all of the data out of the LSM9DS0 in a timely fashion, although I haven't run the calculations to see how the additional 1.72us delay will impact that ability.

I also improved the readFromRegister function to support reading multiple bytes one after the other, which will reduce protocol overhead, but this functionality is untested as yet, since I can't get the SPI->write function to return the value that the DLA clearly shows the LSM9DS0 is transmitting.

Switching to SPI

The code is currently in the midst of being rewritten to use SPI. The current revision on GitHub will not compile, but it shows some of the code being written alongside the research I'm doing for the switch.

I expect to be done with this work in short order. With I2C, there were device addresses as well as register addresses. SPI has no device addresses; instead, it relies on a CS (chip select) line for each device. In this case, the accelerometer/magnetometer (XM) has its own CS line, separate from the gyroscope's (G). XM also has a Serial Data Out (SDO) pin separate from the gyroscope's.

The only thing I'm really confused about at this point is SDI, serial data in. Based on the pattern that is demonstrated by the LSM9DS0, I would anticipate that they would be separate, but as far as I can tell they are unified in a single pin. It is conceivable that you could, therefore, start two separate read commands and read them simultaneously in some fashion, but I really don't see the point yet. It would make more sense to me for there to only be separate CS lines.

But mbed's basic SPI commands seem simple enough. I'd like to set up full, non-blocking transfers, but I haven't found a single example of mbed's "transfer" functions being used anywhere. The callback function signature seems particularly poorly defined. Using just the SPI "write" function, I think I can have SPI operational by the end of tomorrow, unless I've missed something big.

Accelerometer success!

My driver is now reading what look like sane values from the 3-axis accelerometer! I haven't implemented the functions to read the gyro or the magnetometer yet.

Measurements with my digital logic analyzer (DLA) suggest that I2C might not be fast enough for my purposes, and according to the datasheet, the SPI connection is 25x faster. Running I2C at 400kHz, it takes 0.2ms to read a single axis of acceleration data, so 0.6ms to read all three axes. This means that only about 1,670 triples can be read each second, barely more than the 1600Hz at which I'm running the accelerometer alone! The magnetometer is running at 100Hz and the gyro at 960Hz, so all sensors are running at their maximum ODRs (output data rates).
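The shortfall can be worked out directly from the figures in this post. The sketch below assumes each axis is fetched separately at the measured 0.2ms, with no batching; the function names are hypothetical.

```cpp
#include <cassert>

// Triples per second the bus can deliver if every axis costs `millisPerAxis`
// and a triple is three separate axis reads.
inline double busCapacity(double millisPerAxis) {
    assert(millisPerAxis > 0.0);
    return 1000.0 / (3.0 * millisPerAxis);
}

// Triples per second the three sensors produce at the ODRs quoted above.
inline double sensorDemand() {
    return 1600.0 + 100.0 + 960.0;  // accelerometer + magnetometer + gyro
}

// Positive result means the bus cannot keep up with the sensors.
inline double shortfall(double millisPerAxis) {
    return sensorDemand() - busCapacity(millisPerAxis);
}
```

At 0.2ms per axis the bus tops out near 1,667 triples per second against a demand of 2,660, a shortfall of roughly 990 triples every second, so either batching or a faster interface is needed.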

There are techniques for batching reads, so I might look into that, but I designed my driver to be as interface independent as possible, abstracted by a function that reads a register from the LSM9DS0 and a function that writes to one, so switching to SPI should be simple.

I also learned a lesson about trying to write clever macros when you're tired. The macro was causing duplicate reads, which was very confusing for a couple of minutes.

Code comments, redesign, and progress

At this point today, I've added significantly more comments explaining why the code does what it does, and I've redesigned the API I'm exposing and revisited some internal architectural decisions surrounding unsigned types and optional return values. I also made a fair bit of progress on implementing the function to read accelerometer values.

It is nearing midnight local time, so I figured I would go ahead and post this update, though I think I'll keep working once I've published it. Several commits have been pushed up to GitHub already.

Day of Scuba Diving

As noted in yesterday's blog post, today was entirely filled with checkout dives for my open water PADI SCUBA certification, which I have nearly completed. We did three checkout dives at a local quarry that is nearly 2 hours from my apartment one way, or 4 hours round trip. I left my apartment at 6:45am CST and I returned less than 30 minutes ago.

We were supposed to complete our checkout dives tomorrow, but the instructor had something come up, so the final checkout dive will be sometime next week. It looks like I'll be able to pick up working on the LSM9DS0 driver tomorrow, which will be nice.
