This is the final installment in the series of blogs I wrote on the continued use of flat files and why they are no longer viable going forward. The first installment focused on flat files and why embedded software application developers readily adopted them. Then in the second installment, I discussed why embedded developers are reluctant to use databases. The third installment looked at why developers cling to flat file systems.
For this final installment, I realized that the argument for migrating off flat files needs to be made in a more prescriptive fashion. So, together with our head of engineering, Desmond Tan, I’ve developed a list of reasons that isn’t really about what you’re currently using or what you should use in the abstract. Instead, it focuses on which of your actual requirements would justify the switch away from flat files. This list reflects where we see use cases evolving for embedded Edge data management in the mobile, IoT, and complex intelligent equipment spaces. It also reflects what our customers are telling us in private meetings about their challenges and their business and product requirements.
Based on this input, here are the top eight requirements for modern edge data management:
1. Local data persistence and its changing use
Traditionally, data was stored locally in flat files as a cache for immediate operations on data exclusive to the device in question. Furthermore, that data was evaluated without a long-standing yet continually refreshed baseline. Now, however, devices need to combine data from multiple devices and data sources, and to use historical data as a pattern baseline for everything from simple signal-to-noise ratios to machine learning inference. The complex data management needed to feed that local processing and analytics outstrips the bare-bones functionality provided by file systems.
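To make the idea of a continually refreshed baseline concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module as a stand-in for an embedded database. The table name `readings`, the helper functions, the window size, and the three-standard-deviation threshold are all assumptions chosen for illustration, not a recommendation for any particular product.

```python
import sqlite3
import statistics


def open_store(path=":memory:"):
    """Open a local store for readings from any number of devices (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        "device_id TEXT NOT NULL, ts INTEGER NOT NULL, value REAL NOT NULL)"
    )
    return conn


def record(conn, device_id, ts, value):
    """Persist one reading so it becomes part of the historical baseline."""
    conn.execute("INSERT INTO readings VALUES (?, ?, ?)", (device_id, ts, value))


def is_anomalous(conn, value, window=100, threshold=3.0):
    """Compare a new value against the most recent `window` readings from all devices."""
    rows = conn.execute(
        "SELECT value FROM readings ORDER BY ts DESC LIMIT ?", (window,)
    ).fetchall()
    history = [row[0] for row in rows]
    if len(history) < 2:
        return False  # not enough history to form a baseline yet
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > threshold


conn = open_store()
for ts, value in enumerate([20.0, 20.5, 19.5, 20.2, 19.8]):
    # Readings alternate between two hypothetical devices.
    record(conn, "pump-1" if ts % 2 else "pump-2", ts, value)
```

The point of the sketch is that the baseline is a query over persisted, multi-device history rather than something each application has to reassemble from a directory of files.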
2. Support for multiple OS environments with a single storage format
Across product designs, operational technology portfolios, and Enterprise IT support of lines of business, a range of systems at the Edge and back into the enterprise must share data across OSes from Android/iOS to Windows/Linux to Cloud virtualized environments. A single storage format would significantly reduce integration time and cost as well as improve security.
3. Data management support for all roles involved with data processing and analytics
Different roles will need to access the Edge data management platform remotely and locally, across this range of underlying platforms and through different access mechanisms, including CLIs and programming and scripting languages. This is a much-needed, longer discussion that probably deserves a separate blog.
4. Handle Big Data at the Edge
Pop quiz: what are the four Vs of Big Data? Yes, you can all name them if woken from deep sleep (for those of you who refuse to leave that wonderful dream: Volume, Variety, Velocity, and Value). Just as Hadoop was a first step at the Enterprise level to address the need for a common reservoir for Big Data, there needs to be an equivalent at the Edge. And, as was the case for the last requirement, a full blog is needed to discuss further separation from flat files. For now, the key tangible takeaway is that you need to be able to support varied types of data, including JSON, BLOB, and traditional structured data, in a single database.
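As a rough illustration of that takeaway, the sketch below keeps structured columns, a JSON document, and a binary payload together in one table. It uses Python's standard-library `sqlite3` and `json` modules purely as a convenient stand-in for an embedded database; the `samples` table, its columns, and the helper names are hypothetical.

```python
import json
import sqlite3


def make_store(path=":memory:"):
    """Create one table holding structured, JSON, and BLOB data side by side."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS samples (
            id          INTEGER PRIMARY KEY,
            device_id   TEXT    NOT NULL,  -- traditional structured columns
            captured_at INTEGER NOT NULL,
            metadata    TEXT    NOT NULL,  -- JSON document serialized as text
            payload     BLOB    NOT NULL   -- raw bytes, e.g. an image or waveform
        )
        """
    )
    return conn


def store_sample(conn, device_id, captured_at, metadata, payload):
    """Insert one sample and return its row id."""
    cur = conn.execute(
        "INSERT INTO samples (device_id, captured_at, metadata, payload) "
        "VALUES (?, ?, ?, ?)",
        (device_id, captured_at, json.dumps(metadata), payload),
    )
    return cur.lastrowid


def load_sample(conn, sample_id):
    """Read a sample back, decoding the JSON column into a dict."""
    device_id, captured_at, metadata, payload = conn.execute(
        "SELECT device_id, captured_at, metadata, payload FROM samples WHERE id = ?",
        (sample_id,),
    ).fetchone()
    return {
        "device_id": device_id,
        "captured_at": captured_at,
        "metadata": json.loads(metadata),
        "payload": payload,
    }


conn = make_store()
sample_id = store_sample(
    conn, "sensor-7", 1700000000, {"unit": "kPa", "tags": ["line-2"]}, b"\x00\x01\xff"
)
sample = load_sample(conn, sample_id)
```

With flat files, the same record would typically be split across a CSV row, a sidecar JSON file, and a binary file, with the application responsible for keeping the three consistent.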
5. Share data across OT, Cloud and IT traditional environments
Data sharing must handle device-to-device and device-to-gateway scenarios in OT environments, as well as Edge-to-Cloud and Cloud-to-Data-Center scenarios for remote and branch environments at the Edge-to-Cloud interface. In other words, you need a single platform for Client, Peer-to-Peer, Client-Server, and Internet/Intranet architectures.
6. Operate stand-alone during periods of disconnection
While the Edge may be moving overall into a state of hyper-connectivity, this should not be mistaken for removing or reducing the need to process and analyze data locally. In many operational technology applications, a web-only approach is unworkable. For example, in an autonomous vehicle, processing video and lidar signals and deciding where to steer the car, and at what speed and acceleration (or deceleration), cannot be handled by the cloud; latency alone would ensure accidents. In other cases, convenience or ease of use may be the reason you choose to handle operations locally. Say you have thousands of albums locally in your iTunes collection, and you want to search for a song by its lyrics on your fourteen-hour flight from San Francisco to New Delhi.
7. Handle high-speed, multi-channel data collection
Many of the core signals collected won’t change as we move to a more connected and automated state; pressure, volume, and temperature are three great examples of this. However, as referenced above, video streams for everything from facial recognition to machine vision are becoming more prevalent and at far higher resolutions and frame rates (think 4K UHD at 120 Hz). Also referenced above are lidar signals. At one time, I was an engineer and I built laser radar systems. Even in the dark ages, I could easily collect 400 MB of data per day, making decisions in milliseconds on each set of lidar signals collected. I could go on with example after example. The point is that, with modern edge data processing and analysis, volumes running into multiple terabytes per day, where the data is processed immediately but also retrieved in part or in whole for additional processing at a later time, will be commonplace.
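For a sense of scale, here is a back-of-the-envelope calculation for a single 4K UHD stream at 120 Hz. It assumes uncompressed 8-bit RGB (3 bytes per pixel) and continuous capture; real pipelines usually compress video, so treat the result as an upper bound on raw volume rather than a typical storage figure.

```python
def raw_video_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Uncompressed data rate of one video stream, in bytes per second."""
    return width * height * bytes_per_pixel * fps


SECONDS_PER_DAY = 24 * 60 * 60

# 4K UHD is 3840 x 2160 pixels; 3 bytes per pixel is an assumption (8-bit RGB).
rate = raw_video_bytes_per_second(3840, 2160, 3, 120)
per_day = rate * SECONDS_PER_DAY

print(f"{rate / 1e9:.2f} GB/s, {per_day / 1e12:.0f} TB/day")
```

Under those assumptions a single camera produces about 3 GB per second, or on the order of 250 TB per day before compression, which is why multi-terabyte daily volumes are plausible even after heavy compression and filtering.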
8. Leveraging complementary off-the-shelf ecosystem components
It goes without saying that those using flat files are probably do-it-yourself types on other parts of their solutions as well. Moving from flat files will improve their productivity by opening up plug-and-play options for advanced analytics, reporting and visualization tools, and platforms.
In summary, if you are seeing one or more of these requirements, you really should consider migrating from flat files to a single, scalable, secure architecture that can handle multi-platform deployment and development and that provides the horsepower to support a myriad of advanced data processing and analytics use cases. Flat files just won’t do the trick any longer, my friend.