Modern Data - A Rant About How Businesses Store Data

It’s fair to say that I’m a bit of a geek. I enjoy technology, I enjoy tweaking it, and I enjoy making the most of it. I’m also one of the unfortunate ones who still works full time for a huge “enterprise”, and with that come my frustrations. This is partly a rant, hopefully with some educational guidance to help move us forward as a society.

The Problem

I think the problem stems from those 90s/2000s college classes teaching people computers. Back then those classes were all the rage for grandparents. I can remember my grandmother going to our local community college where they were teaching her Excel and Word. Truthfully, I blame Microsoft for a majority of my problem today.

The problem is we store data in the most inaccessible, messiest, craptastic ways! Take, for instance, an issue I’m currently facing. Every few years my employer is inspected by a higher portion of our organization. Higher produces a checklist which includes a list of questions. The questions are broken out into sections and subsections. Each question is in one font color, with the rules regarding that question in a different color, and then a deeper explanation in yet another color. To format this document in Word, they’ve got to insert a return in each place they want to create a new line.

Now, one of the best-known ways to prepare for this inspection (in my organization) is to print each question on its own page and insert the documentation (training records, reports, etc.) behind that page.

So that leaves us managers spending the better part of a week trying to reformat this 95-page document for each of our sections so that we can display and print the questions we need in a usable and readable way.

You might be thinking I’m splitting hairs here, but I’ve got plenty more examples of how we’ve screwed up data.

The Solution

There are a multitude of ways this can be made MUCH easier to work with and track.

For starters, a simple Python web app could hold each question, create a binder for you to print in preparation, and then let you do self-assessments right in the app. Just to clarify, today you’ve also got to filter the questions based on your specific type of operation/organization, MANUALLY, because it’s all in Word and formatted with a bunch of those returns and pretty colors.

If that is out of reach (I get that not everyone enjoys programming as a hobby), a CSV file could be used. Put each question number in the CSV (or even Excel) with a list of what it applies to (the type of operation mentioned above), and then add columns for the remainder of the info. Use a mail merge (Python or Word) to print it how you need it. Add another column at the end to mark whether you think you’re prepared for that question, and even link to the documents required to comply with it.
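As a rough sketch of the CSV idea, the snippet below filters questions by operation type, renders one printable page per question, and totals a readiness column. The column names (`number`, `section`, `applies_to`, `question`, `rule`, `ready`), the sample questions, and the semicolon-separated `applies_to` convention are all made up for illustration; the real checklist would dictate its own layout.

```python
import csv
import io

# Hypothetical checklist data. In practice this would live in a .csv file;
# the column names and questions here are assumptions for illustration.
CHECKLIST_CSV = """number,section,applies_to,question,rule,ready
1.1,Training,warehouse;office,Are training records current?,Records must be updated annually.,yes
1.2,Training,warehouse,Are forklift operators certified?,Certification is required before operating.,no
2.1,Safety,office,Are exits clearly marked?,Every exit needs a visible sign.,yes
"""


def load_questions(csv_text):
    """Parse the CSV text into a list of dicts, one per question."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def questions_for(rows, operation):
    """Keep only the questions that apply to the given type of operation."""
    return [r for r in rows if operation in r["applies_to"].split(";")]


def render_page(row):
    """Format one question as a plain-text page for the binder."""
    return "\n".join([
        f"Question {row['number']} ({row['section']})",
        "",
        row["question"],
        "",
        f"Rule: {row['rule']}",
        f"Ready: {row['ready']}",
    ])


def readiness(rows):
    """Fraction of questions marked ready ('yes' in the ready column)."""
    if not rows:
        return 0.0
    ready = sum(1 for r in rows if r["ready"].strip().lower() == "yes")
    return ready / len(rows)


rows = load_questions(CHECKLIST_CSV)
warehouse = questions_for(rows, "warehouse")

# A form feed between pages makes most printers start a new sheet.
binder = "\f".join(render_page(r) for r in warehouse)
print(binder)
print(f"Ready for {readiness(warehouse):.0%} of warehouse questions")
```

The same few functions replace the week of reformatting: change the operation name and the binder regenerates itself.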

Create the whole document, as is, in text. Markdown would be my preference, but just regular text works. YAML could be used to organize the data into chunks that can be processed, but that might get annoying with such a large number of questions. With text, we can check the document into a version control system, and suddenly, when this whole thing is updated, we can see the changes instantly instead of having to hunt through the whole damn thing to find them!
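To show what “see the changes instantly” looks like, here is a minimal sketch using Python’s standard `difflib` module to compare two versions of a plain-text checklist. A version control system such as Git would do this for you; the file names and questions below are invented for the example.

```python
import difflib

# Two hypothetical versions of the checklist, stored as plain text.
old_version = """\
1.1 Are training records current?
1.2 Are forklift operators certified?
2.1 Are exits clearly marked?
"""

new_version = """\
1.1 Are training records current and signed?
1.2 Are forklift operators certified?
2.1 Are exits clearly marked?
2.2 Are fire extinguishers inspected monthly?
"""


def changed_lines(old_text, new_text):
    """Return only the added (+) and removed (-) lines between two versions."""
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="checklist_old.txt",  # hypothetical file names
        tofile="checklist_new.txt",
        lineterm="",
        n=0,  # no surrounding context, just the changes
    )
    # The first two lines of a unified diff are the file headers; skip them,
    # then drop the "@@" hunk markers by keeping only +/- lines.
    body = list(diff)[2:]
    return [line for line in body if line.startswith(("+", "-"))]


for line in changed_lines(old_version, new_version):
    print(line)
```

Try getting that out of a 95-page Word document with colored fonts.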

Even better, a wiki! Pick up some open source wiki software. Drop each question into the wiki and cut a copy for local use. Now we can forgo the paper altogether! Each organization pulls the wiki down, adds the documentation that shows compliance with each question to its article, and they’re done. Updates can be imported and then viewed in the change log to see which questions changed.

Summary

I know this was mainly a gripe session, but this is a real problem facing our society. We are a digital society, but we literally use technology like it’s still paper. It’s not. Digital data, stored correctly, can be manipulated however we need it: display what we need, when we need it, how we need it.

I think I might start a series on the time we waste doing this and the headaches it causes. My wife has to manually create a new spreadsheet every day showing what has shipped from her company. It takes her about an hour, but it could be done more efficiently, more accurately, and without errors if someone just understood that keeping the data in one place and running a report automatically is stupid simple.

I think it’s time we start teaching all those grannies how to really use computers, including how to store data in more usable ways, starting with ditching proprietary, non-text-based formats that can’t be checked into version control.