With the book out and released, this is the final chapter excerpt that I will post. As I said in one of my previous chapter excerpts, I did not write this book to cover a wide range of topics. I wrote it to cover a narrow range of topics more fully. But the topics I chose were based on my experiences as a Zend Consultant for several years. If you are someone with 2-5 years of experience (the typical requirement for a PHP job) you need this book. This book was born out of my experience dealing with code written by people with 2-5 years of experience, sometimes more.
This chapter is called "Preparing for success, preparing for failure". It contains a few pseudo-rules that can go a long way toward helping you manage unexpected popularity of your website. In other words, they help you minimize the effects of 2-5 years of programming experience. 🙂 Those rules are not complete and there are plenty of exceptions, but knowing these things will help you be more prepared for handling things like load and failure.
Chapter 1: Networking and Sockets
Chapter 2: Binary Protocols
Chapter 3: Character Encoding
Chapter 4: Streams
Chapter 5: SPL
Chapter 6: Asynchronous Operations with Some Encryption Thrown In
Chapter 7: Structured File Access
Chapter 8: Daemons
Chapter 9: Debugging, Profiling, and Good Development
Chapter 10: Preparing for Success
Preparing for success, preparing for failure
Ironically, preparing for success and preparing for failure require very similar disciplines. But before we go into the technical details, let’s define what success is and what failure is. I am sure you can see where I am going with this: success and failure can actually be the same thing.
Success is when you have spent oodles of your own “free” time developing some kind of web site that you think would be kind of cool. The first couple of days see some pretty decent traffic and you’re relatively happy that people like what it is you’ve done. A few weeks pass and your servers are happily chugging along. There are some minor load issues but nothing an additional server or two can’t handle.
Then, for some reason, you start having difficulty sleeping. It could be because you’re starting to become less sure about the stability of your system. Or it could be because that fracking beeper keeps going off because another machine crashed.
Congratulations. You are now successful. This is where the failure starts.
In light of that, here are a couple of things that you can do to be successful and still get a full night’s sleep. These are not in any particular order and, more importantly, none will guarantee that you will not have problems. Why? That question brings us to our first point.
The question is not if hardware (or software, for that matter) will fail, but when. I even read an article several months ago in which the author stated that RAID will eventually fail. Always. The reasoning was that as disk capacity grows, the amount of data that has to be read to rebuild the array after a single disk failure grows with it, until the rebuild itself is likely to hit a read error that the array cannot tolerate.
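The argument usually made for that claim is about unrecoverable read errors during a rebuild, which may or may not be exactly what that article had in mind. As a rough sketch in Python, assuming independent bit errors and a hypothetical error rate of 1 in 10^14 bits read (a figure often quoted for consumer disks, not something I measured), the chance of hitting at least one error grows with the amount of data that has to be read. The function name and its default are mine, purely for illustration.

```python
import math


def rebuild_error_probability(bytes_to_read: float, ure_per_bit: float = 1e-14) -> float:
    """Chance of at least one unrecoverable read error while reading bytes_to_read.

    Assumes every bit fails independently with probability ure_per_bit.
    That is a simplification, and the default rate is an illustrative assumption.
    """
    bits = bytes_to_read * 8
    # 1 - (1 - p)^bits, computed via log1p/expm1 so tiny p does not round away
    return -math.expm1(bits * math.log1p(-ure_per_bit))


if __name__ == "__main__":
    for terabytes in (1, 2, 4, 8, 16):
        p = rebuild_error_probability(terabytes * 1e12)
        print(f"{terabytes:>2} TB read during rebuild: {p:.1%} chance of a read error")
```

The absolute numbers depend entirely on the assumed error rate, but the trend is the point: the more data a rebuild has to read, the less likely it is to finish cleanly.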
I don’t know that I take that pessimistic a view, but it makes for interesting thinking. Do you have enough redundancy that you can still handle load under failure? A common approach is to have two servers. That way if one goes down your web site is still up and running. I believe that to be the wrong approach. If your web application is business critical you need at least 3 machines for each separate application.
There are a couple of reasons for this. The first is maintenance. Your machines will need maintenance. They will break. They will need updates. When you take a machine off your load balancer you need to continue to have redundancy.
What if the maintenance that you are doing is emergency maintenance because your motherboard or backplane went out? The machine is already shut down and you have to drive to the data center, during rush hour of course. Then you need to yank the machine out of the rack and replace the motherboard (or have your hardware support person do it for you). You plug it back in and nothing happens. After 20 minutes you find out it’s because when the motherboard got fried it took a few memory chips with it. Thankfully your hardware support person had some memory chips in the back of the van (you would have remembered to bring them yourself, just in case, right?) and so the memory issue is quickly solved. You boot up the machine and it comes up just fine. Except for the network. You realize that, in your haste, you forgot to unplug one of the network cables when you pulled the machine out of the rack and damaged the network card. This time the hardware support person doesn’t have the right one.
So now your support person is racing back to their warehouse to get a new NIC and you’re sitting in the data center reading XKCD to try to find some humor in the world.
Two hours later the support person gets back with the proper network card and installs it. The system boots up fine.
But the application doesn’t run. Your application is hardware locked. It’s Sunday night and you won’t be able to get a new key until the next day.
Did I mention that tomorrow’s Cyber Monday?
3 servers. Minimum. If you are concerned about cost, buy 3 smaller ones rather than two medium ones. I hope you never need to learn this lesson the hard way. Thankfully, I learned it the easy way; someone smarter than me told me.
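To put numbers on the two-versus-three argument, here is a small Python sketch. The function name and the idea of splitting load evenly across whatever machines are still online are my own simplifying assumptions, not a model of any particular load balancer.

```python
from typing import Optional, TypedDict


class ClusterState(TypedDict):
    online: int
    redundant: bool
    load_share: Optional[float]


def cluster_state(total_servers: int, offline: int) -> ClusterState:
    """Describe what is left when some servers are offline (maintenance or failure).

    Hypothetical helper: assumes identical servers and perfectly even load.
    """
    if not 0 <= offline <= total_servers:
        raise ValueError("offline must be between 0 and total_servers")
    online = total_servers - offline
    return {
        "online": online,
        # Redundant means one more machine can die and the site stays up.
        "redundant": online >= 2,
        # Fraction of total traffic each remaining machine has to carry.
        "load_share": 1 / online if online else None,
    }


if __name__ == "__main__":
    for total in (2, 3):
        for offline in (1, 2):
            print(total, "servers,", offline, "offline:", cluster_state(total, offline))
```

With two servers, a single maintenance window leaves one machine and no safety net. With three, the same window still leaves two machines sharing the traffic, which is also worth keeping in mind when sizing the smaller machines: in this simple model each one carries half the load whenever the third is offline.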