I started the work on this series three months ago, which seems like both a very long time ago, and not very long ago at all. This blog series started out with a very simple use case. Can we use Grafana and InfluxDB to visualize metric data from an application? I was going to write a single blog entry about a small project. How I went from proving out a use case to writing a 6-part blog series that is almost entirely unrelated to the use case is still kind of a mystery to me – yet here we are. With the series winding down, it is time to talk about what I have learned during this project. When I pitched the idea for the series, I thought I would finish it in a couple of weeks, then go find something else to do. The first lesson is that everything takes longer than I think it will at the beginning.
In the last 3 months, I have built a dashboard to visualize the data from my cable modem 4 times, with 4 different tools. Each version of the dashboard was interesting, and I learned more about each of the tools. This is good; learning about these tools was the point of all of this. I also learned a few new things about Docker and Python, which are excellent tools to have in the toolbox for all sorts of projects. Some of the snippets of code and patterns that I use in various projects got a little better and more resilient.
Being a little over-zealous (read: hopelessly optimistic) about my project timeline when I first built the script to capture the data, I did not take the time to figure out how to get the script to run completely in the background. A PowerShell script kicked off a process to launch Chrome, navigate to the management page for my modem, do some stuff, and then close. In total, the process takes about 5 seconds. But this was an interruption every 10 minutes, for 2 and a half months. I have 5 screens and 3 different computers on my desk, so this was not as detrimental as it might sound.
I have known for a long time that Selenium and chromedriver scripts can be run headless. I just never found the time to sit down and try it out for myself. I got bored one Saturday morning, and it was starting to become a pain to manage the different scripts, so I decided to get it sorted out. The modification was only a couple of lines of code, which took one cup of coffee to complete. I use large coffee mugs, so this was maybe 90 minutes of real time. I spent about 15 minutes on the research, 2 minutes on the code change, 25 minutes getting side-tracked on the internet, and 3 minutes testing the changes. I then got side-tracked again by my cat being adorable, costing another 15 minutes. Once I got back to the keyboard, it took 30 minutes to commit and deploy the changes and verify that everything was still working. Had I spent the 20 minutes doing the work at the beginning of the project, I would have saved myself a lot of distraction time over the course of this project.
It is probably appropriate to write something about the products that I used for this project. I won’t rank the products. This is not a “Top 20 data visualization products of 2020” series. I dislike those sites and find them to be the Google search equivalent of Spam. Not email spam. I mean that stuff that comes in a can that smells bad and has questionable origins. You can eat it, and you probably won’t die, but should you? Comparisons between the tools can be made, of course, but they require use cases and requirements, maybe even a statement of work if I am lucky. Sure, one product is going to do a certain thing better than another, but that only matters if that thing is part of your requirements. Dark user interfaces are hugely important to me. I have 5 screens and I don’t want a sunburn. Some people may prioritize performance over aesthetics. Other people might have priorities like scalability, security compliance, and (heaven help us) budget.
Each of the tools could have a home in an organization. Many of the tools are fully capable of solving just about any kind of data problem you could throw at them, assuming you start with some planning. InfluxDB is fantastic with metric data. Elastic is the search engine that drives a lot of technology out in the world. Splunk can do just about anything with any kind of text data that you can throw at it. Humio brings speed and efficiency to real-time data analytics. And all of them do so much more.
No matter which tool I was using, I ran into some sort of challenge along the way. These challenges were never show-stoppers, just things caused by something I did not know to plan for. This happens in every project. Whether it is a proof of concept for a product, a security exercise, or building some shelves in the garage, nothing ever goes exactly to plan. The trick to handling a challenge successfully is to incorporate the new facts into the design. With the updated design to guide the decisions, the next step is to figure out how to take another step toward the goal. If you occasionally retcon your requirements and pretend that the thing you are going to do, which is very different from what you planned to do, was part of the plan the whole time – that is ok too, sometimes.
The last lesson that I learned has to do with time management and deadlines. I like to think I am at my best when I am unstructured, no rules, just a mad scientist with way too many screens doing fun stuff to see what happens. No matter how much I have fought against deadlines, they are unavoidable. They might even be necessary. Don’t tell any of the project managers I said that. Each post in this series took 1-2 days working with the tools to build the dashboard, and another day or so for the writing and editing. Like most people, I have several projects running at the same time. Without good time management, I might have run out of time to come up with funny bits about frozen vegetables, pumpkin cookies, and Spam. That would have been a travesty.
I have been telling myself for years that I would find time to do more writing. The writing did not get done until a schedule and a deadline were put in place. While I have been working on this series, I began a new project that may turn out to be work-related at some point. It will take me a year to complete, but I started that project with a goal and a timeline. Combining the goal with a timeline increases the likelihood that it will be completed. Who knows, I might write another series about that project!
And finally, we have reached the end. Thank you for sticking with me, I hope you enjoyed this series. The next time you find yourself in need of help with a data analytics platform, get in touch with us and let’s talk about it. I promise we won’t talk about cable modem data, unless you bring it up.
This blog was written by Greg Porterfield, Senior Security Consultant at Set Solutions.