The Current Reality of Low-Cost Sensors: Scientific Data, Actionable Data, and Lies

The rise of low-cost sensors, like MEMS, optical, piezoelectric, chemical, and others, has made it possible to integrate sensing into everyday devices. These sensors, which once required bulky laboratory equipment, can now be found in commonplace objects like phones, watches, and even toasters. With the connectivity of these devices and widespread networking, a vast amount of data can be gathered and used.

The problem is that while it’s tempting to assume that your sensor is measuring what you want it to measure, that is so far from the truth that “all sensors lie” has become a common refrain among engineers. Measuring what you want to measure in a specific application requires a significant amount of filtering, analysis, and calibration to compensate for confounding factors, and even combining data from multiple sensors (sensor fusion) for accurate measurements. Successfully using sensors in any application involves a lot of trial and error.
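
As one illustration of sensor fusion, here is a minimal sketch of a complementary filter, a common way to blend a gyroscope (smooth but prone to drift) with an accelerometer (drift-free but noisy) into a single tilt estimate. The function names, the `alpha` weight of 0.98, and the sample values are assumptions chosen for illustration, not taken from any particular product.

```python
import math


def accel_tilt_deg(ax: float, az: float) -> float:
    """Tilt about one axis estimated from the gravity vector (noisy, but does not drift)."""
    return math.degrees(math.atan2(ax, az))


def complementary_filter(samples, dt: float, alpha: float = 0.98, initial_deg: float = 0.0):
    """Fuse gyro rate (deg/s) and accelerometer tilt (deg) into one angle estimate.

    samples: iterable of (gyro_rate_dps, accel_angle_deg) pairs.
    alpha:   weight given to the integrated gyro; 0.98 is an illustrative value.
    """
    angle = initial_deg
    estimates = []
    for gyro_rate_dps, accel_angle_deg in samples:
        gyro_angle = angle + gyro_rate_dps * dt  # short term: integrate the gyro
        # long term: pull the estimate toward the accelerometer reading
        angle = alpha * gyro_angle + (1.0 - alpha) * accel_angle_deg
        estimates.append(angle)
    return estimates


# A stationary device tilted 10 degrees, with a gyro that has a 0.5 deg/s bias.
readings = [(0.5, 10.0)] * 2000
fused = complementary_filter(readings, dt=0.01)
```

Integrating the biased gyro alone would drift without bound. The fused estimate instead settles at a small, fixed offset from the accelerometer angle, so the filter trades an unbounded error for a bounded one.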

Over the past decade, sensor companies have been adding “smarts” to their products by embedding some of the algorithms that previously required external software. While this is good in many cases, sometimes this hidden and undocumented processing can reduce accuracy. This is especially true for sensors that measure dynamic signals where sampling frequency and bandwidth are important.

Don’t get me wrong … it’s remarkable what tiny MEMS and other low-cost sensors are capable of, but they aren’t accurate enough for certain applications. Let’s look at why.

Good Enough Versus Laboratory-Grade Sensors

First, let’s look at the differences between your phone’s 32¢ motion sensor and the $4,000 accelerometer used in laboratory equipment. If you delve into the specifications, you’ll see significant differences in variables like sensitivity, resolution, accuracy, repeatability, signal-to-noise ratio, bandwidth, stability, drift, response time, ruggedness, environmental range, and calibration. All of these affect the measurements taken.

Aside from those differences, the $4,000 sensors also have NIST traceability. Most measurements taken for scientific purposes must be made with instruments that can be traced to defined standards. This ensures that when the sensor reports a variable at level X, that reading is accurate. In the U.S., the standards body is the National Institute of Standards and Technology (NIST). NIST defines the measurement standard and often defines appropriate approaches for calibrating sensors against that standard.

To ensure that measurements taken by scientific and medical equipment continue to be highly accurate, the equipment is calibrated against the reference standard regularly. Calibration can be very involved; an entire industry has developed around calibrating equipment. That kind of rigor is costly.

However, NIST traceability isn’t necessary for most everyday sensing. If the accelerometer in your watch misses a few steps, would you even notice? Low-cost sensors are good enough for low-stakes everyday applications. Problems arise when low-cost sensors are used in applications where they cannot measure accurately enough under all conditions, and so are no longer good enough.

Air quality monitoring is one area where low-cost sensors can result in large errors. For example, the popular PurpleAir sensor is well documented as poorly calibrated and unable to compensate well for humidity. It also has poor particle size selectivity and is prone to large errors due to age and exposure. So, while the large network of PurpleAir monitors around the world does provide meaningful insight into general particulate levels in different areas (which we’ll get to in a moment), these measurements should not be trusted for critical decisions, and no single measurement is reliably meaningful.
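
To show what compensating for humidity can look like in practice, here is a minimal sketch of a linear correction applied to a raw particulate reading. The class name and all three coefficients are hypothetical placeholders. Real coefficients would have to be fitted by running the low-cost sensor next to a reference-grade monitor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearCorrection:
    """corrected = slope * raw + rh_coeff * rh + offset (all coefficients hypothetical)."""

    slope: float
    rh_coeff: float
    offset: float

    def apply(self, raw_pm25: float, rh_percent: float) -> float:
        """Return a humidity-corrected PM2.5 estimate in the same units as raw_pm25."""
        if not 0.0 <= rh_percent <= 100.0:
            raise ValueError("relative humidity must be between 0 and 100 %")
        corrected = self.slope * raw_pm25 + self.rh_coeff * rh_percent + self.offset
        return max(corrected, 0.0)  # a concentration cannot be negative


# Placeholder coefficients for illustration only; not a published calibration.
EXAMPLE = LinearCorrection(slope=0.5, rh_coeff=-0.1, offset=5.0)
corrected = EXAMPLE.apply(raw_pm25=40.0, rh_percent=50.0)
```

A correction like this reduces one known source of error. It does not address aging, particle size selectivity, or unit-to-unit variation.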

Measured Versus Inferred Data

Things get more complicated when low-cost sensors are used with algorithms to provide information that they aren’t actually measuring. At best, the reported values are correlated with the sensor data, and at worst, they are merely inferred from it.

Respiration rate measured by smartwatches is one example. It is derived from the watch’s photoplethysmography (PPG) signals, which detect other physiological parameters related to blood flow, like heart rate and oxygen saturation (SpO₂). Respiration rate can be estimated from this data, but it is not measured directly.

Worse are “CO₂ equivalence levels” derived by some volatile organic compound (VOC) sensors. CO₂ equivalence is a way of expressing the potential environmental impact of detected VOCs in terms of equivalent CO₂ emissions (i.e., this level of VOCs is as bad for global warming as this much CO₂ would be). But VOC sensors don’t measure CO₂. Instead, they use assumptions about the gas being detected and algorithms with questionable math to infer what the equivalent CO₂ level would be if the assumptions are accurate.

While there may be strong practical correlations, the use of AI algorithms to infer something that isn’t directly related to what the device is sensing is dubious. Consider devices that claim to non-invasively measure blood glucose by optical, thermal, or other sensor techniques. There are observable correlations between blood glucose levels and many measurements that sensors can provide, but there is currently no accurate and reliable method to quantitatively measure blood glucose using non-invasive techniques. The consequences of putting too much trust in one of these sensors could literally be deadly.

Data Versus Meaning

Once you have your data, using that data in a meaningful way is the next challenge when working with low-cost sensors.

Beyond accurate data, there are challenges in interpreting it to determine the best course of action. Fall detection is a good example: if you can sense the direction, acceleration, and rotation of a person in real time with good accuracy, then you might think that you can detect a fall, maybe even early enough to do something to prevent injury. But imagine if your phone interpreted a sudden drop as a fall and automatically dialed 911 when, in reality, you’d just dropped your phone. While this feature is common on phones and smartwatches now, it is wrong far more often than it is right. Confirmation and delay of action are required to avoid false alarms, and both are actively being used to “train” the detection algorithms.
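
The confirmation-and-delay idea can be sketched as a two-stage check: first find a candidate event (free fall followed by an impact), then wait before acting, and cancel if the user dismisses the prompt or normal movement resumes. Every threshold, duration, and name below is a hypothetical value chosen for illustration, not how any shipping phone or watch works.

```python
from enum import Enum


class Outcome(Enum):
    NO_EVENT = "no_event"
    CANCELLED = "cancelled"
    PENDING = "pending"
    ALERT = "alert"


def find_candidate(magnitudes_g, sample_hz, free_fall_g=0.4, impact_g=2.5, window_s=1.0):
    """Return the index of the first impact that follows free fall within window_s, else None."""
    window = int(window_s * sample_hz)
    last_free_fall = None
    for i, m in enumerate(magnitudes_g):
        if m < free_fall_g:
            last_free_fall = i
        elif m > impact_g and last_free_fall is not None and i - last_free_fall <= window:
            return i
    return None


def evaluate(magnitudes_g, sample_hz, user_ok_after_s=None,
             delay_s=5.0, settle_s=0.5, still_tol_g=0.2):
    """Decide whether a candidate fall should raise an alert after a confirmation delay."""
    impact = find_candidate(magnitudes_g, sample_hz)
    if impact is None:
        return Outcome.NO_EVENT
    # Confirmation: the user dismissed the prompt before the delay ran out.
    if user_ok_after_s is not None and user_ok_after_s <= delay_s:
        return Outcome.CANCELLED
    settle = int(settle_s * sample_hz)   # ignore ringing right after the impact
    delay = int(delay_s * sample_hz)
    window = magnitudes_g[impact + 1 + settle: impact + 1 + delay]
    # Movement away from ~1 g suggests the person (or phone) is being moved again.
    if any(abs(m - 1.0) > still_tol_g for m in window):
        return Outcome.CANCELLED
    if len(window) < delay - settle:
        return Outcome.PENDING           # not enough data yet to confirm
    return Outcome.ALERT


# 10 Hz trace: rest, brief free fall, impact, then 6 seconds of stillness.
trace = [1.0] * 10 + [0.1] * 3 + [3.0] + [1.0] * 60
result = evaluate(trace, sample_hz=10)
```

Note that a dropped phone lying still on the floor produces the same trace as a person who has fallen and is not moving. The accelerometer alone cannot tell them apart, which is why the user's confirmation is part of the decision.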

Real life in the real world is complicated and the consequences of false or missed activations are serious enough that it’s generally not feasible to determine the best course of action based on sensor data alone. Drawing that line is critical. Understanding all of the factors involved in making that decision—in the few milliseconds available—is the real design and engineering challenge. AI algorithms are being developed and trained to interpret sensor data because we just don’t have analytical approaches that work well enough.

Measuring something as simple as the temperature can be surprisingly challenging. While you can accurately measure the temperature of the sensor itself, more than likely you are interested in measuring air temperature or the temperature of some other specific point. The sensor is likely not exclusively influenced by the target point: heating from the circuit board, motors, batteries, and even the user’s body are common problems that make measuring temperature challenging.

There are a few exceptions where action is taken on sensor data alone. Front airbags use sensors to detect when to deploy, reducing the occupant’s impact in life-threatening crashes. They generally do not deploy for fender-benders, where the risk of injury from airbag deployment outweighs the risk of injuries from the impact. However, side impact airbags are an exception to that exception. They are supposed to deploy only in rollover situations, but they have deployed on bumpy roads under normal driving conditions, and this still occurs, rarely, after years of development and testing.

Seeing the Big (Data) Picture

While low-cost sensors may not be highly accurate and they’re certainly not calibrated to the reference standards necessary for critical measurements, they don’t really need to be for most applications. When those applications are looking at big-picture data rather than basing actions on single measurements, they are good enough. The Apple Watch might not count every step you take, but it’s correct within 1-2% for most people, which is good enough. While day-to-day measurements in the Apple Health app likely contain too much noise to be individually reliable, looking at how that data trends over longer time periods provides useful insights.

Similarly, a large number of low-cost air quality sensors can provide meaningful data, even when the sensors themselves have limited individual accuracy. On a geographic scale, the tens of thousands of deployed low-cost air quality monitors provide useful information on air quality around the world. Those measurements, collected over time, over wide geographic spaces, and over thousands of units, can be transformed from imprecise data points into a meaningful and useful picture.
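
The reasoning behind this can be sketched in a few lines. If each unit’s random error is independent, the uncertainty of the average shrinks with the square root of the number of units, and a median keeps a few badly drifted units from dominating. The function names and sample readings below are illustrative assumptions.

```python
import math
import statistics


def standard_error(sigma: float, n: int) -> float:
    """Uncertainty of the mean of n independent readings, each with random error sigma."""
    if n < 1:
        raise ValueError("need at least one reading")
    return sigma / math.sqrt(n)


def regional_estimate(readings):
    """Median of nearby readings: robust to a few badly drifted or failed units."""
    if not readings:
        raise ValueError("need at least one reading")
    return statistics.median(readings)


# 100 units with +/-10 units of random error each average down to about +/-1.
network_uncertainty = standard_error(sigma=10.0, n=100)

# One failed unit (95) barely affects the median of its neighbors.
estimate = regional_estimate([11, 12, 13, 95, 12])
```

Averaging only reduces error that is random and independent between units. A bias shared by every unit, such as the same humidity response, passes straight through to the aggregate and still has to be corrected separately.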

Sensor Use at Daedalus

Using sensors, even under simple circumstances, comes with a surprising number of challenges. When things get complex, the challenge of using sensors increases exponentially.

At Daedalus, we specialize in those challenging situations. We’ve developed a process that helps us identify the best sensor for the application, and we’ve learned to rely on multiple sensors, using diverse technologies and placed in different locations, to ensure more accurate and reliable results, particularly when avoiding errors is critically important. We’ve used sensors to ensure fall safety compliance, pinpoint missing surgical sponges, detect drugs in the U.S. mail, locate underground pipes, non-invasively detect bilirubin, and help protect industrial workers from exposure to excessive noise, dust, and hazardous gases.