Proactive IT Maintenance to Minimize Downtime
No one needs to explain to you the virtues of being proactive. Of course, no one can make you (or this writer) do it either. You may know that you should change the timing belt on your car every 60,000 miles or so, but it’s even easier to do nothing about it -- until your belt breaks and either damages your engine or prevents it from running. (This was written from experience.) The next catastrophe could be just around the corner, but if you prepare for it, you might be able to avoid it altogether. If you don’t have a robust proactive maintenance program for your IT environment, it may just be (in the vernacular) an “accident waiting to happen”.
What do we mean by maintenance, and how many modes of maintenance are there? In the IT field, you hear about two: proactive vs. reactive maintenance. But some prefer to recognize four modes of maintaining systems: reactive, preventive, predictive, and proactive. Reactive maintenance is basically a “run-to-fail” strategy. Just keep the equipment or system going until it breaks down. Elizabeth Mazenko of the website Better Buys offers a good description of these four modes, which we borrow and summarize below. In this case, Mazenko considers both preventive and predictive modes to be part of a proactive approach.
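- Reactive: minimal staff required (advantage); cost for repairs (drawback)
- Preventive: increases asset life span (advantage); doesn't consider asset wear (drawback)
- Predictive: full visibility of assets (advantage); need for monitoring equipment (drawback)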
The Virtues of Proactive Maintenance
Even though it’s not necessary, out of due diligence we should detail here some of the benefits of proactive maintenance in IT operations. Steven Saslow of the Information Technology Group writes about “The Benefits of Being Proactive in Information Technology”. His “key takeaways” include the benefit of avoiding failure entirely. If your goal is not to fail, proactivity is a pretty important concept to keep in mind. Saslow writes that most failures in an organization can be easily avoided. And he says that proactivity increases productivity and decreases costs.
Who doesn’t want to avoid failure? There are plenty of old adages that support the idea that proactivity is a truism -- something that’s obviously true. The Boy Scouts motto is “Be Prepared”. Proactivity is contrary to procrastination. You know what they say: “A stitch in time saves nine.” And how about this one: “An ounce of prevention is worth a pound of cure.” You get the point.
Anyone who persists in inadequate preparation for possible IT failures has no excuse, really. But we won’t go on preaching about virtue. Each person must decide their approach to maintenance for themselves.
Proactivity and Asset Management
Supposing you are convinced that you should be proactive with your IT environment, the question then arises: “What needs maintenance?” It would be too simplistic to answer “everything”. Due diligence in IT management requires a comprehensive accounting for all IT assets and a plan for maintaining them. The thing about data centers is that over time they can become a jumbled mess. Consider the spaghetti-like cables that sprawl beneath raised floors. And how about all the big and small devices in all the racks and cabinets and closets and offices throughout the company’s campus? And what about remote locations?
In the preparation for an effective proactive maintenance plan, you may need to refresh your asset inventory. One way to do that is with software that auto-discovers all existing elements on the network. PCMag has put together a comparison of “The Best Asset Management Software of 2018”. You can research them further on your own:
- ManageEngine AssetExplorer
- MMSoft Pulseway
- Asset Panda
- LANDesk IT Asset Management Suite
- SolarWinds Web Help Desk
- BMC Track-It!
- InvGate Assets
This writer once worked at a headquarters for a grocery chain to scan and identify all the IT assets in their data center. It was a lengthy project, and we worked with SolarWinds. Cisco devices also have an auto-discover function to get information about other SNMP-based devices. And the free online software Spiceworks will also automatically discover devices on a network.
Once the device data is collected, the next step would be managing and manipulating the data. Network elements can be put into databases that schedule, monitor, and track IT maintenance activities. These functions can also be integrated into a sound change control process.
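As a minimal sketch of what such a tracking database might hold, the Python below keeps a small in-memory inventory and lists the assets whose next scheduled service date has arrived. The `Asset` fields, the device names, and the intervals are all hypothetical; in practice these records would come from whichever discovery tool you use.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Asset:
    """One inventory record (field names are illustrative assumptions)."""
    name: str
    location: str
    last_serviced: date
    interval_days: int  # how often scheduled maintenance is due

    def next_due(self) -> date:
        """Date on which the next scheduled service falls."""
        return self.last_serviced + timedelta(days=self.interval_days)


def work_list(assets, today):
    """Return the assets due on or before `today`, most overdue first."""
    due = [a for a in assets if a.next_due() <= today]
    return sorted(due, key=lambda a: a.next_due())


# Hypothetical sample data, not taken from any real environment.
inventory = [
    Asset("core-switch-01", "HQ data center", date(2018, 1, 15), 90),
    Asset("ups-02", "HQ data center", date(2018, 3, 1), 180),
    Asset("branch-router-07", "Remote office", date(2017, 12, 1), 90),
]

for asset in work_list(inventory, today=date(2018, 5, 1)):
    print(asset.name, "was due on", asset.next_due().isoformat())
```

A list like this could feed straight into a change control ticket, so that each scheduled task is reviewed and recorded like any other change.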
Preventive vs. Predictive Maintenance
We said that preventive and predictive maintenance are two different types of proactive maintenance. But what is the difference between them? We could summarize by saying that preventive maintenance follows a set time schedule, while predictive maintenance is based on the anticipated conditions of the assets being maintained. Let’s look to our sources for clarification.
Mazenko gives definitions for both types. “Preventive maintenance involves regularly performed, planned tasks that are scheduled based on either time passed or meter triggers.” And according to an article on “reliability-centered maintenance” by Better Buys, “Predictive maintenance relies on conducting maintenance based on trends within equipment data.” Others have slightly different definitions.
An article from Core Systems on the question explains the difference using automobile maintenance as an analogy. Suppose your vehicle manufacturer calls for an oil change every three months or every 3,000 miles. If you don’t drive the car so much and clock fewer than 3,000 miles at the end of three months, you may then take your car in for its three-month oil change anyway. This is preventive maintenance. But if you are putting on a lot of miles, reaching the 3,000 miles in less than three months, the maintenance type would be considered predictive maintenance. That is, the manufacturer predicts that any car hitting 3,000 miles will have degraded its oil so much that it needs to be changed to maintain a healthy engine.
In the IT world, preventive maintenance would be regularly scheduled work, such as monthly or annual checks. Predictive maintenance would depend on the real-world conditions of the hardware or software. A printer that has crossed the threshold of a certain number of printed pages or copies, for instance, might require maintenance. Quality control and root cause analysis are associated with predictive maintenance. Ask yourself: Do the current conditions of the IT asset suggest that failure is likely or imminent?
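To make the distinction concrete, here is a rough Python sketch that follows this article's framing: a crossed usage threshold counts as a predictive trigger, and an elapsed calendar interval counts as a preventive one. The function name, its parameters, and the page limit in the examples are assumptions made for illustration, not values from any vendor.

```python
from datetime import date


def maintenance_trigger(last_serviced, today, interval_days,
                        pages_since_service, page_limit):
    """Classify why (or whether) a printer needs service.

    Returns "predictive" when the page counter has reached its limit,
    "preventive" when only the calendar interval has elapsed,
    and None when neither trigger has fired.
    """
    if pages_since_service >= page_limit:
        return "predictive"
    if (today - last_serviced).days >= interval_days:
        return "preventive"
    return None


# Heavy use: the page limit is reached long before the 90-day check.
print(maintenance_trigger(date(2018, 1, 1), date(2018, 2, 1), 90, 52000, 50000))

# Light use: the 90-day interval elapses first.
print(maintenance_trigger(date(2018, 1, 1), date(2018, 4, 15), 90, 12000, 50000))
```

Checking the usage counter first is a design choice: when both triggers fire, the actual condition of the asset is the more informative reason to record.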
From O&M to Analytics
In an article for Techopedia called “IT Infrastructure: How to Keep Up”, David Scott Brown deals with the history of IT hardware and software maintenance. He traces models and methods of IT management, including operations and maintenance (O&M), virtual and cloud computing, and automation and analytics. The old “break-fix” model of IT equipment maintenance is gradually being replaced by a more predictive approach. Now companies like HP offer such concepts as IT Operations Analytics (ITOA).
For a lighter look at the issue, let’s see what science fiction envisions about predictive and proactive maintenance. Consider this selection from the script of Stanley Kubrick’s film “2001: A Space Odyssey”:
HAL: “Sorry to interrupt the festivities, Dave, but I think we've got a problem.”
BOWMAN: “What is it, HAL?”
HAL: “My F.P.C. shows an impending failure of the antenna orientation unit…. The unit is still operational, Dave, but it will fail within seventy-two hours.”
The intelligent HAL 9000 computer was doing predictive analysis. Today’s technology is moving toward this model, using whatever machine learning and artificial intelligence it can muster. Automated systems already perform instant notification and even self-healing when things go wrong.
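A far humbler version of that kind of forecast can be sketched in a few lines of Python. The example below fits a straight line to a handful of readings and extends it forward to estimate when a limit will be reached. It is only an illustration: the function name, the disk-usage readings, and the 82 percent limit are made up, and real predictive tools use much richer models than a linear trend.

```python
def hours_until_threshold(samples, threshold):
    """Estimate hours until a metric reaches `threshold`.

    `samples` is a list of (hour, value) readings. A straight line is
    fitted by ordinary least squares and extended forward from the last
    reading. Returns None when there are too few readings or the trend
    is flat or falling.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_v - slope * mean_t
    crossing = (threshold - intercept) / slope  # hour at which the line hits the limit
    last_t = samples[-1][0]
    return max(0.0, crossing - last_t)


# Hypothetical disk usage (percent full), sampled once a day.
readings = [(0, 70.0), (24, 72.0), (48, 74.0), (72, 76.0)]
print(round(hours_until_threshold(readings, 82.0)), "hours until the limit")
```

With these invented readings the estimate comes out at about seventy-two hours, which is the sort of warning HAL gave Bowman.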
Developing a Maintenance Plan
With all you know about proactive maintenance, creating a good maintenance plan should be easy, right? That depends on a lot of things, including your organizational skills and your technical knowledge. It might help to scour the internet to get ideas for the formation of your plan. For example, Microsoft has online documentation about maintenance plans for SQL Server.
But you may not need to look further than the operations and maintenance manuals that come with your hardware or software. Many vendors have excellent maintenance procedures already written and available in their documentation. It would only be a matter of referencing the text in your maintenance procedures.
Whatever plan you develop, we should remind you that the best practice is to integrate it into your change control plan. Whether it is daily, weekly, monthly, or yearly preventive maintenance, or predictive maintenance based on actual conditions, you know that it’s best to plan your work and work your plan.
At the risk of heaping on the cliches, we will add one more to close our discussion: “If you fail to plan, you are planning to fail.” That may have come from Benjamin Franklin, a man of great wisdom. To tie a ribbon on it, let’s just say that proactive maintenance should be a top priority for any IT professional.