ORRacle

Systems Update


There is a lot of information here, but I wanted to give everyone a comprehensive update on our current systems challenges.

For those of you who want a quick look at where we are with getting our systems up and running, you can log into the System Status Site set up by Kody Gerber and get an easy-to-understand view of what is and what isn’t running at the present time. You can find that site at: https://sites.google.com/orrprotection.net/systems-status/home (Note: You will have to be logged into your Google account to see this information.)

The details

In the early morning hours of September 15, IT personnel noticed unusual activity and data damage on several systems on our network. It’s been almost a month, and I know there are many questions about what happened and why we are not back to normal. Most of the communication about this incident has been managed by our corporate counsel, Jim Herr, and we very much appreciate Jim’s assistance in communicating with you so we could concentrate on getting things back to some semblance of normal.

Lots of rumors and speculation have been floating around about what happened. I get that, and to make matters worse there will probably be many questions we will never be able to fully answer. We don’t live in a TV show where problems are simple and solutions are generated in 42 minutes. And honestly, I think the most important question people are asking is, when can we get back to work?

Here is my best attempt at describing why we are not back to normal yet:

Our network is a very complex web of interconnected systems. Most of these we rarely think about. We just log in, and Sage, ServiceMax, NetReport, email, and Excel, along with our personal banking, Facebook, Instagram, YouTube, and many other services, just work. So when they stop working, it can be very frustrating and confusing.

One of the most common questions I’ve been asked these past few weeks is, “Don’t we have backups of all our data? Why not just restore the backups?” The answer to that question is, “Yes, we have copies of all of the data.” But restoration is not a simple process. Let me try to explain.

Imagine a jigsaw puzzle with 10,000 pieces. Off to the side you have another copy of that puzzle. Now your two-year-old daughter comes in and carries away a bunch of the puzzle pieces. In addition, she starts feeding puzzle pieces to your lovable but anxious terrier. For fun, she tears a few in half. (You’re suitably impressed with her dexterity and strength but unhappy about the damage.) And because she is bored and doesn’t understand the concept of puzzles, she gets a box of pieces from other puzzles and throws them into the pile.

When you get to the puzzle, you don’t know what pieces are missing, damaged, or just plain don’t belong. You think, “I’ll just wipe all this mess away and rebuild the puzzle with the spare box.” That seems reasonable and easy, but then you realize a lot of the work already done will be lost. And if you start all over again, it will take a long time to rebuild what has been lost. So you say, “I’ll just use the pieces from the other puzzle to fill in what is missing from the original.” But you soon realize you need to examine each piece of the puzzle and determine:

  • Does it belong or not?
  • Is it already there, meaning you have a duplicate piece?
  • If you need it, where does it go?
  • Can you use it now, or do you have to wait for some other pieces to be put in place first?

It may sound easy, but there are good reasons IT people hate doing restores. They are long, painful, and rarely work as desired.

Nice story, but when can I go back to work?

Not all the systems are up yet, and some that are working are on borrowed equipment, so you’ll notice programs running more slowly than normal. Furthermore, we have added some security tools and processes to the network.

Virtually all of you should be able to work today. Some of you rely on special functions and programs that we are still working on. To see which systems are currently working, log into the System Status Site put together by Kody. You can find that site at: https://sites.google.com/orrprotection.net/systems-status/home

What about VPN and S Drive access?

We are planning additional security measures and training to make things less susceptible to damage. Among these changes are:

  • VPN access will be primarily limited to Sage users. If you don’t use Sage, chances are pretty good you will not need to use the VPN.
  • All network drives are being migrated to Google Shared Drives. You will be notified when these migrations have been completed.
  • Branch facilities will no longer have local servers. Print functions will be handled by cloud-based print drivers, and file storage will be limited to Google Drive. And please, don’t store data on your local computer. If your computer is lost or compromised, we will lose that data forever. Data stored in Google is protected and backed up.
  • We will soon release a series of security videos which every employee will be required to view. A network system is like an apartment complex, with every resident having keys to get inside the building. Most intrusions occur when a resident intentionally or unintentionally gives bad people access to their keys. We’ll help train you on how to keep your keys in your pocket.
  • We have implemented multi-factor authentication. This will make it more difficult to log in, but it will make it less likely an intruder will be able to get into our system even if they have your password.
  • You will be asked to change your password more often and to use more complex passwords. We are currently looking at some password assistance methods to reduce the problems associated with this policy change.
  • We are currently rebuilding the data warehouse, which is used for more complex financial reporting. We hope to have it up and running early next week.

Special thanks to some outstanding people.

One thing that really saved our bacon was a team of dedicated IT professionals who did not let discouragement keep them from getting to work. The most superhuman effort came from Bob Neville. There were other monumental efforts, but I want everyone to know that had it not been for Bob’s dedication and sacrifice of personal time, and his brilliant solutions to some impossible problems, we would still be down today. I don’t mean to diminish the efforts of others, like Jonathan, Sean, Kody, Lisa, Thomas, Ben, Jerry, Phil, and Jim Herr (who was made an honorary member of the IT Team). During this crisis we have met twice a day, starting and ending each day with a group huddle, and many of us have worked until well past midnight and on weekends. I also want to thank those of you who have offered words of encouragement and assistance. It was noticed and greatly appreciated.

Conclusion

We will be working on pieces of our system for weeks to come, but most of what is needed to get back to work has been restored. As we progress, we are not only getting things back to normal, but we are building into the system measures that will help us prevent this from happening again. But we will need your help. Thank you for your patience and support, and please help us all out by learning what you can when we release the security training modules. As always, if you see something unusual or strange, don’t hesitate to contact us. You won’t get attitude or judgment if it turns out to be nothing. You will get appreciation if it helps us identify a problem. We are still pretty busy, so we will get to you as soon as we can. Please continue to report issues and ask questions on the System Disruption Channel. Please be patient with us - we are pedaling as fast as we can. Thank you for your understanding.

About the Author: Bruce Aldridge

Bruce is the Vice President of Information Technology and ePMO for ORR Corporation.
