Finding the RFID Signal in the Noise

By Kurt Hozak and Olajumoke Awe

Use insights based on Nate Silver's The Signal and the Noise to make better use of RFID data analytics to improve processes.

Realizing the full potential of RFID often requires going beyond using it simply as a means of fast data collection in otherwise unchanged processes (a "paving the cowpaths" approach). Higher return on investment is possible if RFID is used to facilitate enhanced or entirely new processes (see RFID Research Supports Real-World Experimentation).

In some cases, RFID directly integrates into how an activity is performed, but it's also possible to use data analytics with RFID to indirectly make operations and marketing improvements. Using insights gleaned from the data, companies can make predictions about anything from customer and employee behavior to supply chain events, which in turn can lead to changed processes (see The Emerging Marketplace for RFID Data Analytics, or Finding a Needle in a Haystack and A Guide to RFID Analytics Software for Retail and CPG Companies).

It has been said that compared to data-collection alternatives, RFID's unique capabilities and low variable costs for each additional read create a "fire hose" of big data. While one might infer that better decisions will naturally follow from having more data, taking advantage of it is not always as easy as one might hope. The award-winning and best-selling book The Signal and the Noise, by Nate Silver, makes many points that are relevant to RFID, though he does not specifically discuss it as a data source. He observes that while the amount of information available to us is greater than ever, we are not very good at making predictions, and "we face danger whenever information growth outpaces our understanding of how to process it" (p. 7). Silver's insights can be applied to RFID to avoid prediction pitfalls and, in turn, improve processes.

As RFID infrastructure develops, information sharing across supply chains becomes more common, and data analytics software and data aggregators become more capable, companies will have many more variables and data points that can be incorporated into their analyses and subsequent predictions. While that sounds like a good thing, Silver observes that when people have more information, they are prone to cherry-pick data that fits their biases and desired outcomes. Arguments that sound convincing can be made for any side of most issues, especially to those predisposed to believing a particular point of view. This is not always done consciously, so organizations and analysts need to work hard to prevent and detect it.

First-hand experience can be used to help verify that correlations in the data are causally related and not just statistical coincidences that are ever more likely to be observed simply because of the huge and growing amount of data collected. An organizational culture that takes advantage of bottom-up engagement may help identify opportunities for which RFID data analytics can truly make a difference.
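The growing risk of coincidental correlations can be illustrated with a small simulation. The sketch below is not from Silver's book or any RFID product: it generates variables that are independent by construction (standing in for unrelated read streams) and counts how many pairs still look correlated. The function names, the 0.5 threshold, and the sample sizes are illustrative assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def num_pairs(num_variables):
    """Number of distinct variable pairs that could be compared."""
    return num_variables * (num_variables - 1) // 2


def count_spurious_pairs(num_variables, num_observations, threshold=0.5, seed=0):
    """Count pairs of independent random variables with |r| above threshold.

    Every variable is pure noise, so any pair counted here is a coincidence.
    """
    rng = random.Random(seed)
    data = [[rng.gauss(0, 1) for _ in range(num_observations)]
            for _ in range(num_variables)]
    hits = 0
    for i in range(num_variables):
        for j in range(i + 1, num_variables):
            if abs(pearson(data[i], data[j])) > threshold:
                hits += 1
    return hits


if __name__ == "__main__":
    # Hypothetical setup: 20 observations per variable, 10 vs. 100 variables.
    for k in (10, 100):
        hits = count_spurious_pairs(k, 20)
        print(f"{k} variables, {num_pairs(k)} pairs, {hits} spurious correlations")
```

Going from 10 to 100 variables raises the number of candidate pairs from 45 to 4,950, so even a small per-pair chance of a coincidental match yields many apparent relationships. That is why process knowledge is needed to decide which correlations deserve attention.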

In his article Free the People, RFID Journal editor Mark Roberti suggests letting workers who can benefit from RFID play significant roles in driving how it is used. This is compatible with lean principles that suggest that frontline workers often have the best vantage point from which to suggest operational improvements (see Lean and Six Sigma Create Valuable Synergies for RFID Adopters).

The lowest-level workers may not have the perspective to see insights from RFID data spanning outside their immediate purview (e.g., across departments and supply chains), but the point is that data collection and analytics should be organically driven from those who are in a position to know. If data analytics can help find needles in haystacks, but there are a multitude of haystacks in the barn of big data, companies should take a bottom-up approach to identify which haystacks are most likely to have the needles.

Real innovation and vital organizational buy-in are more likely from those whose deep knowledge of company processes lets them see RFID's potential (and also distinguish when ill-conceived projects are being promoted). Especially in lean cultures, such employees are naturally incentivized to try to employ RFID data and analytics in value-adding ways that benefit them and their company, and they are more likely to be successful because their theoretical understanding is grounded in their first-hand experience. In the parlance of Silver, they are more likely to pursue RFID projects that collect data that adds to the useful "signal" instead of the distracting and misleading "noise."

Although too much data can tempt one to cherry-pick from it, too little data can also be a problem. Silver calls overfitting "the most important scientific problem you've never heard of" (p. 163). It involves creating models that try to do too much with limited and noisy (highly variable) data. Assuming data for the right variables is being collected (e.g., because engaged employees have identified theoretically likely opportunities), then having more data (made possible by RFID) can help avoid being misled by the statistically dangerous combination of small sample sizes and high variability.
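The danger of combining small samples with high variability can be sketched numerically. The example below is a simplified illustration rather than a model-fitting exercise: it repeatedly estimates an average from noisy measurements and reports how much those estimates bounce around. The function name, the "dwell time" framing, and all parameter values are hypothetical.

```python
import random
import statistics


def spread_of_sample_means(sample_size, noise_sd, true_mean=10.0,
                           trials=400, seed=1):
    """Standard deviation of the sample mean across repeated noisy samples.

    Each trial draws `sample_size` measurements (e.g., hypothetical dwell
    times in minutes) around `true_mean` and records their average.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(true_mean, noise_sd) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


if __name__ == "__main__":
    small = spread_of_sample_means(sample_size=5, noise_sd=4.0)
    large = spread_of_sample_means(sample_size=500, noise_sd=4.0)
    print(f"spread of estimates with 5 reads:   {small:.2f}")
    print(f"spread of estimates with 500 reads: {large:.2f}")
```

Statistical theory says the spread should be roughly the noise level divided by the square root of the sample size, so 100 times as many reads should make the estimate about 10 times steadier. A model tuned to any single five-read sample would largely be fitting noise.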

While it's important to understand the theoretical causal relationships between variables, Silver says, it's often a mistake to think that there are grand unifying theories behind phenomena that provide simple explanations. He observes that the worst prediction mistakes typically occur when forecasts are grounded in a desired story instead of reality, ignore difficult-to-measure risks despite their seriousness, rest on assumptions and approximations that are significantly less accurate than recognized, and avoid dealing with fundamental uncertainties.

Rather than using highly precise models that wrongly convey a high degree of accuracy, he suggests an empirical approach whereby a reliance on observation develops a tolerance for complexity and unpredictability. He points out that when more diverse data is collected, models will often not be able to account for everything and will therefore look less impressive, but that is a good thing because it helps make us aware of our lack of understanding and keeps us from being overconfident.

For those with the ability and desire to properly collect and analyze the data, RFID can create valuable perspective. Silver notes that when there is a major forecasting failure, it is often related to an "out of sample" condition that is vastly different from the data used to construct the model. Because RFID can help us collect data that was previously impractical to obtain, it can reduce the number of out-of-sample situations, greatly expand and diversify samples, and help us better understand certain types of risks and uncertainty (e.g., the frequency of rare but significant supply chain or healthcare events). Larger and more diverse data sets that are appropriately interpreted tell us more about the world as it really is and fill in gaps in our knowledge so that we are much less reliant on approximations, assumptions, and misleading models.

Data sharing across supply chains is especially helpful at uncovering unforeseen realities and complexities. Analysts and organizations need to carefully listen to what the data is telling them, even if it is not compatible with their expectations. Silver notes (p. 423), "The 9/11 Commission Report identified four types of systemic failures that contributed to our inability to appreciate the importance of these signals, including failures of policy, capabilities, and management. The most important category was failures of imagination. The signals just weren't consistent with our familiar hypotheses about how terrorists behaved, and they went in one ear and out the other without our really registering them."

RFID data can also be misused or under-used because of similar failures. Great technology can be wasted or even harmful if the right policies, capabilities, and management aren't also in place. Sometimes the data will be counter to what one would prefer to believe. For example, although Procter & Gamble may have hoped for more action from Wal-Mart based on information from tagged promotional displays, the data and the retailer's alleged response to it were undoubtedly informative (see Procter & Gamble Halts Tagging of Promotional Displays).

Silver believes that modeling approaches trying to make sense of data have historically been heavily reliant on nebulous assumptions that have led to terrible predictions. Noting our "naïve trust in models" (p. 11), he says (p. 15) that "the solution requires an attitudinal change. This attitude is embodied by something called Bayes's theorem … Bayes's theorem is nominally a mathematical formula. But it is really much more than that. It implies that we must think differently about our ideas—and how to test them. We must become more comfortable with probability and uncertainty. We must think more carefully about the assumptions and beliefs that we bring to a problem."

Rather than extrapolate far into the future with a static forecast, Silver suggests (pp. 451-452), "This is perhaps the easiest Bayesian principle to apply: make a lot of forecasts. … Bayes's theorem says we should update our forecasts any time we are presented with new information. … Companies that really 'get' Big Data, like Google, aren't spending a lot of time in model land. They're running thousands of experiments every year and testing their ideas on real customers." RFID provides a steady stream of information that can be used to facilitate frequent experimentation and forecast updates. Just as lean favors quick-response pull production over obsessing about the development of a perfect forecast, RFID data analytics with Bayes's theorem can facilitate better service at lower cost in response to dynamic operations and marketplace conditions.
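One simple way to picture frequent forecast updates is a Beta-Binomial model, a standard textbook application of Bayes's theorem. It is shown here only as a sketch, not as a method Silver or any RFID vendor prescribes. The class name, the starting belief, and the daily read counts are all hypothetical, and the forecast is the share of tagged items expected to be found on the shelf.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Belief about a rate (e.g., on-shelf availability) as a Beta distribution."""
    alpha: float  # prior weight plus items observed on the shelf
    beta: float   # prior weight plus items observed missing

    @property
    def mean(self):
        """Current point forecast of the rate."""
        return self.alpha / (self.alpha + self.beta)

    def update(self, found, missing):
        """Bayesian update: add the newly observed counts to the belief."""
        return BetaBelief(self.alpha + found, self.beta + missing)


if __name__ == "__main__":
    # Hypothetical weak prior centered on 50% availability.
    belief = BetaBelief(alpha=2, beta=2)
    # Hypothetical daily RFID shelf scans: (items found, items missing).
    daily_scans = [(18, 2), (15, 5)]
    for day, (found, missing) in enumerate(daily_scans, start=1):
        belief = belief.update(found, missing)
        print(f"day {day}: forecast availability = {belief.mean:.3f}")
```

Each batch of reads nudges the forecast rather than replacing it, and the more data accumulates, the less any single day moves the estimate. The prior also makes the starting assumption explicit, which fits Silver's call to think carefully about the beliefs brought to a problem.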

Raw technical statistical skills are not enough for high-quality results. Disciplined acumen is required to relate the variables to each other so that data analytics do not devolve into data dredging. As Silver describes it (p. 167), "The wide array of statistical methods available to researchers enables them to be no less fanciful—and no more scientific—than a child finding animal patterns in clouds." The sheer volume of data from RFID and other modern data sources provides zoos of animal patterns, but not all the data is as meaningful as it seems. With the prevalence of Big Data, Silver notes that some may ask (p. 197), "Who needs theory when you have so much information? But this is categorically the wrong attitude to take toward forecasting… Statistical inferences are much stronger when backed up by theory or at least some deeper thinking about their root causes."

Developing the organizational capabilities to better see and take advantage of RFID data signals amongst the noise requires leadership and commitment. Trust needs to be cultivated so that employees are confident that identifying opportunities and communicating forecast uncertainty will be well received. Training may be required at all levels to teach new ways of thinking and avoid common pitfalls.

It is not a coincidence that trust and training were also identified as the two key contextual factors for making lean work at Toyota (see Flexibility Versus Efficiency? A Case Study of Model Changeovers in the Toyota Production System). Of course, it does no good to analyze data if one cannot respond with effective operations and marketing. RFID and data analytics can be powerful tools, but, like lean techniques, they work best when embedded in a synergistic system.

Kurt Hozak is an associate professor of operations management at Coastal Carolina University's E. Craig Wall Sr. College of Business Administration and an operations and technology management consultant. Olajumoke Awe is an assistant professor of operations management at Coastal Carolina University's E. Craig Wall Sr. College of Business Administration.