Tag Archives: Book

Last Facebook and Twitter book giveaway

OK, this is it. The LAST giveaway for "You want to do WHAT with PHP?" The giveaway will occur at the end of the day on Tuesday! This will also be the second-to-last giveaway for the User Group library. The rules still apply; if you win I will ask you to cover shipping costs. See one of my previous posts on what that looks like.

So, to win

  1. Tweet this posting. The tweet MUST contain the bit.ly link for this page. I will make the drawing based on who bit.ly says has tweeted it.
     
  2. Become a fan of my Facebook page and Like this posting.  The Like will be how you will be entered.  But remember, the winner will be asked to cover shipping costs.
     
  3. Be the user group leader for your local PHP user group. Enter by emailing me at [email protected]. Your details will need to match the Zend User Group List.

Book Description:
PHP is the hottest thing in Web development today, with over 30 million Web sites currently using the technology. Programmers are scrambling to get as much information as they can to stretch their use of PHP to the limits, and this book will help you do just that. In it you will find practical, but atypical, examples of PHP coding.

Author Kevin Schroeder is co-author of The IBM i Programmer's Guide to PHP and the Technology Evangelist for Zend Technologies—The PHP Company—with an in-depth knowledge of the language and an innovative approach to development. He shows PHP solutions and examples you won’t find in other books: binary protocols, character encoding, asynchronous operations, structured file access, daemons, and much more.

There are quite literally hundreds of books of the “build a Web site in PHP” genre, but this is not your typical how-to “PHP 101” book. No, this book is for the creative and the curious. It is a “what’s possible” book. Creative PHP types who have already purchased a book on PHP and want to go beyond the basics into what is possible, not just practical, will find it especially useful. Curious PHP types who have written a fair amount of PHP code but are now ready to go deeper, into the lower-level aspects of PHP, will also be inspired by what this book has to offer.

You Want to Do What with PHP? is like a PHP cookbook. It contains not just theories, but also considerations and varying options on top of simply providing code. With it, you will discover new approaches to problem-solving using PHP, expand your development skill set, and gain valuable assistance toward building ever more advanced PHP applications.

With You Want to Do WHAT with PHP? you will:

    * Become a better PHP programmer
    * Discover unusual, but practical, solutions to problems you will likely face on a weekly, if not daily, basis
    * Learn lower-level programming that is not typical for Web-based applications
    * Discern why operating system level options matter when devising solutions

Contents:
   Chapter 1: Networking and Sockets
   Chapter 2: Binary Protocols
   Chapter 3: Character Encoding
   Chapter 4: Streams
   Chapter 5: SPL
   Chapter 6: Asynchronous Operations with Some Encryption Thrown In
   Chapter 7: Structured File Access
   Chapter 8: Daemons
   Chapter 9: Debugging, Profiling, and Good Development
   Chapter 10: Preparing for Success

This week’s book giveaway

This Friday I will be giving away the second round of copies (technically the 4th and 5th) of my book "You want to do WHAT with PHP?" The rules still apply; if you win I will ask you to cover shipping costs. See one of my previous posts on what that looks like.

So, to win

  1. Tweet this posting. I will change it up a little and say that it needs to contain the bit.ly link for this page. I will make the drawing based on who bit.ly says has tweeted it. The ywtdwwphp hashtag was a little weird.
     
  2. Become a fan of my Facebook page and Like this posting.  The Like will be how you will be entered.  But remember, the winner will be asked to cover shipping costs.

Read first, then tweet, for a free book

A few weeks ago I blogged that I was going to give away 9 copies of my book, You want to do WHAT with PHP?  Well, today is the day to start.  There are a few ways you can win.

  1. Tweet this page.  You can use the kudos link on the top right of the page, retweet it, or tweet it on your own, but it MUST be under the hashtag of #ywtdwwphp.  If you use the kudos link, that tag will be automatically added to your tweet. That is how I am tracking who to put into the random drawing.  I will be making my drawing around 12:00 noon, central time, on Friday, November 12th.  That's tomorrow.
     
  2. Join my Facebook page for the book.  Around the same time I do the drawing from Twitter I will be choosing a random person from all of the people who like my Facebook page.
     
  3. If you are part of a user group AND the user group has a member library, have the user group maintainer email me. The information that I receive MUST match the Zend local user group information. I will make that drawing on Wednesday of next week.

I will do this three times over the next three weeks since the publisher gave me 9 copies to give away.

One last thing: you are getting the book for free, but you will still need to pay shipping costs if you win. USPS estimates about $13 to ship almost everywhere. Cheaper in the U.S. But I will use any shipping method that you like, for example if you want something with a tracking number. So, before you retweet make sure that you a) know what it will cost to ship to your address, and b) are willing to do that. I will accept PayPal to cover the shipping costs. If you win but don't want to cover shipping costs, I will politely thank you for entering and run the random person selector again.

If you want to increase your odds of winning, both RT this page and join my Facebook page.

If you want to, or need to, contact me, email me at [email protected].

Good luck!!

You want to do WHAT with PHP? GIVEAWAY

I just got my copies of my book "You want to do WHAT with PHP?" today. During my conversations with MCPress, my publisher, I had asked for 3 copies to do a social media promotion and they agreed. I posted that I would be giving away 3 copies on Twitter and got a whole bunch of "I WANTS". So I asked my publisher for more copies to give away and they agreed to another 6. So that is a total of 9 copies I have available to give away.

Now the problem I have is: how do I give them away? So here's what I'm thinking. First of all, I'd like to open it up to everyone around the world. But shipping 9 copies of a book to Timbuktu can get expensive, so I will ask that anybody who wins be willing to pay for shipping from my address in the U.S. to the address where you want to receive it. I believe it will fit into a USPS international mail envelope, which looks like it costs $13 US to ship anywhere in the world (I tried the UK, India and China and all came up as $13). So it doesn't look horribly unreasonable. If the actual shipping costs end up being more, I'll give you the option of deciding.

So, with 9 books I'm thinking of splitting it into 3 categories with 3 winners each. I would split up the winners over the course of three weeks, announcing the winners on each Friday (or thereabouts). The standard exclusions would apply: family, friends, coworkers (sorry all you Zenders!). The categories would be:

1) User group libraries. What I would do is have the user group leaders submit their group to me via Twitter, Facebook or email. I would have to verify that it is an actual user group and so I would check it against the list that Zend maintains. I, technically, manage that list so if you want your user group on it contact me now.

2) Twitterers. I will put up a blog post; to enter, all you would need to do is retweet the link to that page (and be willing to pay for shipping).

3) Facebookers.  I have a fan page for the book.  I would choose 3 random "Likers" from that page to win a copy of the book.  To win, just "Like" the page and be willing to pay postage.

How does that sound?  If you think it's dumb feel free to make alternate suggestions in the comments.

UPDATED: I would also be willing to accept PayPal to cover shipping charges

UPDATE #2: My twitter handle is kpschrade

You want to do WHAT with PHP? Chapter 10

With the book out and released I now reach the final chapter excerpt that I will have. As I said in one of my previous chapter excerpts, I did not write this book to cover a wide range of topics. I wrote it to cover a narrow range of topics, more fully. But the topics I chose were based on my experiences as a Zend Consultant for several years. If you are someone with 2-5 years of experience (the typical requirement for a PHP job) you need this book. This book was born out of my experience dealing with code written by people with 2-5 years of experience, sometimes more.

This chapter is called "Preparing for success, preparing for failure".  It contains a few pseudo-rules that can go a long way to helping you manage unexpected popularity of your website.  In other words, to help you in minimizing the effects of 2-5 years of programming experience.  :-) Those rules are not complete and there are plenty of exceptions, but knowing these things will help you be more prepared for handling things like load and failure.

Preparing for success, preparing for failure

Ironically, preparing for success and preparing for failure require very similar disciplines. But before we go into the technical details, let’s define what success is and what failure is. As I am sure you can see from where I am going here, success and failure can actually be the same thing.

Success is when you have spent oodles of your own “free” time developing some kind of web site that you think would be kind of cool. The first couple of days see some pretty decent traffic and you’re relatively happy that people like what it is you’ve done. A few weeks pass and your servers are happily chugging along. There are some minor load issues but nothing an additional server or two can’t handle.

Then, for some reason you start having difficulty sleeping. It could be because you’re starting to become less sure about the stability of your system. Or it could be because that fracking beeper keeps going off because another machine crashed.

Congratulations. You are now successful. This is where the failure starts.

In light of that, here are a couple of things that you can do to be successful and still get a full night’s sleep. These are not in any particular order and, more importantly, none will guarantee that you will not have problems. Why? That question brings us to our first point.

Expect Failure

The question is not if hardware (or software, for that matter) will fail, but when. I even read an article several months ago from one guy stating that RAID will eventually fail. Always. This is because as disk capacity grows, the amount of data that has to be read to rebuild the array after one disk failure would eventually exceed what the remaining disks can deliver without hitting an unrecoverable read error.

I don’t know that I take that pessimistic a view, but it makes for interesting thinking. Do you have enough redundancy that you can still handle load under failure? A common approach is to have two servers. That way if one goes down your web site is still up and running. I believe that to be the wrong approach. If your web application is business critical you need at least 3 machines for each separate application.

There are a couple of reasons for this. The first is maintenance. Your machines will need maintenance. They will break. They will need updates. When you take that machine off your load balancer you need to continue to have redundancy. What if the maintenance that you are doing is emergency maintenance because your motherboard or backplane goes out? The machine is already shut down, you have to drive to the data center, during rush hour of course. Then you need to yank the machine out of the rack and replace the motherboard (or have your hardware support person do it for you). You plug it back in and nothing happens. After 20 minutes you find out it’s because when the motherboard got fried it took a few memory chips with it. Thankfully your hardware support person had some memory chips in the back of the van (you would have remembered to bring them yourself, just in case, right?) and so the memory issue is quickly solved. You boot up the machine and it comes up just fine. Except for the network. You realize that, in your haste, you had forgotten to unplug one of the network cables when you pulled it out of the rack and had damaged the network card. This time the hardware support person doesn’t have the right one.

So now your support person is racing back to their warehouse to get a new NIC and you’re sitting in the data center reading XKCD to try and find some humor in the world.

2 hours later the support person gets back with the proper network card and installs it. The system boots up fine.

But the application doesn’t run. Your application is hardware locked. It’s Sunday night and you won’t be able to get a new key until the next day.

Did I mention that tomorrow’s Cyber Monday?

3 servers. Minimum. If you are concerned about cost, buy 3 smaller ones rather than two medium ones. I hope you never need to learn this lesson the hard way. Thankfully, I learned it the easy way; someone smarter than me told me.

You want to do WHAT with PHP? Chapter 9

There is a bunch I could say to introduce this chapter.  However, I think that by reading the first few paragraphs you will know what I'm talking about.  For those who are experienced developers some of these items might seem a little basic, but there are reams and reams of PHP developers who do not follow several of these rules.

In other news, "You want to do WHAT with PHP?" is now available for purchase in the Amazon store.

Debugging, Profiling and Good Development

Permit me to rant for a moment. Not a big rant. Just a little rant. As of writing this sentence I have been a consultant for Zend Technologies for over three years. One of the things I do, among many others, is to conduct audits of customer applications. While it does happen, seldom are audits done “just in case.” Usually there are specific problems that I am tasked with. More often than not they are performance problems. Sometimes the problem is with PHP logic, sometimes with the database, but performance problems nonetheless. I am going to let you in on the secret to why I am good at what I do. It’s not because I’m special or have any type of secret knowledge. It is a button.

[Profile button in Zend Studio]

If you do not know what that button is, it is the Profile button from the Zend Studio toolbar. That is my secret. One of the very first things I do when I look at a new application is start running profiles on it. The audits I do generally last a week and it is imperative that I have a good understanding of what is happening in the application within the first day. It is also imperative that I have an understanding of what the main problems are by the end of the third day. This is done first with the Profiler, then with the Debugger, and then through using Code Tracing.

Don’t think that this is limited simply to the customers that I have worked with. PHP has a reputation of being a sort of script kiddie language. While this is not at all true, many PHP developers have not helped the cause by producing poor code. In many cases, really poor code. What is poor code? I remember reading a quote stating that programmers can spot bad code easily. What they have trouble doing is identifying good code. So it’s a bit of a tough thing to define what “good code” looks like.

I pondered this for a bit and did some Googling to see what others thought. Here’s a list of what I found.

  • Well-organized

  • Commented

  • Good naming conventions

  • Well-tested

  • Not "clever"

  • Developed in small, easy to read units

In fair disclosure, I stole this list from Stack Overflow. Someone else wrote it on Dec 14, 2008. I think that the list is accurate. A lot of bad code misses the first item, meaning that it is not well organized or structured. Jackson Pollock would never have made a good programmer. While some claim it, I would tend to disagree that coding is an art form. It is definitely creative, but one of the purposes of art is to convey an idea via different and new mediums. Art doesn’t have to be maintained by someone else. Code does, and you should not have to “study” code to discover its meaning. The meaning should be plain and the code should be structured.

However, even a lot of code that is considered “well architected” fails on the fifth point. Some of the worst performing code I have seen was because someone was trying to be too smart. The code fit all of the other parts but missed the part about not being clever. Overly clever code is, unfortunately, a bit difficult to document. One definition could be code that tries to guess what is going to be needed later on in a request; front loading program logic instead of loading it lazily. Another possibility could be using multiple levels of variable variables. Yes, you have to know how they work for the Zend Certification exam and, yes, they are useful to handle an abstraction layer. But each level deeper that you use a variable variable you decrease the readability of your code by an order of magnitude. Another example would be the overuse of magic methods.
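
To make the variable-variable point concrete, here is a minimal sketch. The variable names are made up for illustration and do not come from the book. Each extra dollar sign adds another hop that the reader has to trace by hand, while the plain array lookup at the end states its intent directly.

```php
<?php
$name  = 'total';
$total = 42;

// One level: $$name resolves to $total.
$one = $$name;

// Two levels: the reader must trace $pointer -> 'name' -> 'total' -> 42.
$pointer = 'name';
$two = $$$pointer;

// The same lookup through an explicit array needs no tracing at all.
$values = ['total' => 42];
$plain  = $values['total'];

var_dump($one === $two, $two === $plain);
```

All three variables end up holding the same value; the only difference is how much work it takes to see that by reading the code.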

My definition of good code is a good balance between reviewed “stream of consciousness” code, or spaghetti code, and the over-architected solutions. “Reviewed stream of consciousness” is code that you wrote in moments of inspiration that you have gone back over and re-examined, refactored and unit tested. In the past, a lot of PHP code was written as a stream of consciousness. More recently though, in many ways, we are going to the other side of the pendulum and making our code so complicated that it’s just as difficult to work with. Additionally, many modern developers seem to have the idea that you can click a button, or run a command, or type some pseudo code and because our computers are so powerful we can have our application all but built after the command has been run.

This brings me to Kevin’s Framework Observation. It states that “the effects of the law of unintended consequences are inversely proportional to the ability of a developer to build an application without a framework”. Why is that? Because when developers start using a framework they will often check their brains at the door. I cannot tell you how many times I’ve wanted to respond to people posting on one of several framework mailing lists that the answer to their question will be found by simply reading the framework source code. You are working with PHP, therefore you are working with open source code. If you are having a problem with understanding why a framework is doing something a certain way, pull out the debugger, which you should already be using prolifically, and find out why it’s doing that. One of my colleagues stated “You shouldn’t be using a framework until you know you need a framework.” Once you know you need a framework it is much more likely that you will have a full enough understanding to properly harness the framework. Don’t “use” a framework; harness it.

You may think that I am anti-framework after reading that, but that is most definitely not the case. I like frameworks because I am more productive when I use them. However, when I use a framework I make sure I know what is going on behind the scenes. Are you using Active Record, loading up a thousand objects to do some kind of calculation and wondering why your page is slow? Then you don’t understand the concepts of Active Record.
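
As a rough illustration of the Active Record trap, the sketch below totals one column two ways. It is not taken from the book: the orders table and the Order class are hypothetical, and it assumes the pdo_sqlite extension is available so that it can run against an in-memory database.

```php
<?php
// Assumption: pdo_sqlite is installed. Table and class names are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)');

$insert = $db->prepare('INSERT INTO orders (amount) VALUES (?)');
for ($i = 1; $i <= 1000; $i++) {
    $insert->execute([$i]);
}

// Stand-in for an Active Record model: one object per row.
class Order
{
    public $id;
    public $amount;
}

// Active Record style: hydrate 1,000 objects just to add up one column.
$slowTotal = 0;
$orders = $db->query('SELECT id, amount FROM orders')->fetchAll(PDO::FETCH_CLASS, 'Order');
foreach ($orders as $order) {
    $slowTotal += $order->amount;
}

// Let the database aggregate and hand back a single value instead.
$fastTotal = (int) $db->query('SELECT SUM(amount) FROM orders')->fetchColumn();

var_dump($slowTotal === $fastTotal);
```

Both totals agree, but the first one pays for a thousand object instantiations along the way. Running each version under a profiler is a quick way to see the difference for yourself.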

Most of the time, the problems that I have seen are because the developers do not know what is happening in their application. There really is no excuse for this. There are two main debuggers available, both of which support profiling functionality and both of which are free. For PHP they are the Zend Debugger and Xdebug. If you are not familiar with those, or do not know how to use them, DO NOT read any other chapters in this book until you are proficient at debugging and profiling. Now, don’t get me wrong. PHP allows you to throw code at a wall and make it stick. But the developer is ultimately responsible for what gets written.

Every single one of the examples in this book has been run using the debugger. A lot of the code in the other chapters should be run by using a debugger so you can understand the actual logic flow and variable flow for the individual method calls. Even if you build something right the first time it is a good idea to go through it with the debugger. Why? Because if you built it right the first time there’s probably something wrong with it. To be honest with you, if I write some section of code logic that is beyond simple variable modification and it runs the first time through, my fear is that I did something wrong. And you know what? About half the time that fear is correct. Always check your work. A debugger is great for that.

This chapter is going to look at four different methods for gaining insight into your application and making changes more predictable: 1) debugging, 2) profiling, 3) code tracing, and 4) unit testing. We will also look at some steps that you can take on your local operating system to get lower level information about what it takes to get your application serving a request. All of the examples are going to be done with Zend Studio 7.1, but some of this functionality is available in the Eclipse PDT project (the most popular project on the Eclipse website as of this writing) as well.

You want to do WHAT with PHP? Chapter 8

PHP is a language generally not suited for running daemons. That said, PHP can do it, and in certain circumstances does it sufficiently for the job. In this chapter we look at some of the things you need to know about to build a PHP-based daemon. This excerpt doesn't feature any code, but it does set the foundation for why I think PHP is fine for daemons in some circumstances. Later in the chapter we get into the code.

Daemons

In reading this chapter you might get the idea that I am saying that you should be using PHP for building large scale, daemon-based applications. While I definitely am of the opinion that PHP can be used for that, PHP was not designed for it. For me, whether or not PHP can be used as a daemon comes out of the “use the right tool for the job” discussion that often happens online. Usually it comes up as an argument after one person bested another in some kind of language shoot-out. The loser of the argument (regardless of the actual merits of their chosen language) half-concedes by saying “Well you just use the best tool for the job.” Meaning, “You got me, but I won’t be made to look the fool.” Or “It’s not worth arguing with this idiot.” Chances are it’s the latter.

The problem is that there are lots of jobs that need to be done that are outside of the specialty of a given language. Following the “best tool for the job” mentality you might have part of your infrastructure running PHP, part of it running Ruby on Rails, part of it running .NET and part of it running Java. Nothing against any of those languages, but absolutely NONE of them do EVERYTHING well. Java is a good example. I personally like programming in Java. However, to have a web site running in Java you need to have layer upon layer upon layer of application server upon application server upon application server to make it run. Will it run? Yes. Will it run fast? Yes. Are web pages what Java was designed to do? With respect to JSP developers, no. Same thing with Ruby. Ruby is really good if you can afford to have very little flexibility. The benefit of Ruby on Rails comes at the price of inflexibility. If you need specific control over aspects of your application, Ruby is probably not the best choice for you.

Where am I going with this? The question of using “the best tool for the job.” A heterogeneous environment is not a good one to build for or manage. Getting one of those to work should be the goal of academics. Real life, with all its warts, will do much better with a fuller understanding of a smaller subset of features. This is because as soon as you have more than one type of architecture in your organization you now need to have more than one skill set. Having more skill sets means that it will be difficult to find experts. Having fewer experts means more time spent on support calls with other people whose knowledge base is also too wide.

What is the solution to this? I would contend that there is not a solution. Just like there is no vehicle that serves all needs, there is no programming language that serves all needs. Additionally, the people you have around are important factors. If you have developers who are good, but not great, at PHP, Ruby, Python, Java and .NET (who also need to know Flash), where will you have the skill set internally to handle a problem for which there isn’t an answer on Google? It’s like buying a Peugeot in Wyoming. I would contend that if you have several languages and infrastructures that you support, your developers are going to have more problems and be less creative. Creativity for the sake of creativity is not good. However, creativity allows for innovative solutions to difficult problems. Far too many people pick up a language and work with it, thinking that they are experts because someone else said the language was cool. However, the more languages you know the less of an expert you will be in each.

This takes us back to the earlier statement; using “the best tool for the job”. I would contend that this is wrong, or at least, problematic. I would go the route of saying “use the best tool for the organization” instead. That organization could be your place of business, a non-profit you work for, the website for a friend who’s starting a new business, your church, etc. How do you decide which tool you are going to use for a specific problem?

How you make the decision is not set in stone. In fact, it’s unlikely that your organization made an intentional decision beyond “Hey, we’re building a website; we should use PHP, right?” If your organization has a significant web presence, then that is probably the right decision. PHP does web good. But what if you need something else? Say, for example, you need a simple message queue. You have several options. One is to obtain some kind of third party messaging software. There are plenty of options available for you, and they are probably going to be better than what you would build.

But there is often a problem with taking off-the-shelf software, both proprietary and open source. Half the time it’s complicated enough that once you install it you need even more experts to manage it. So, say you have simple needs: you need a message queue but your organization does not have the skill to support another language.

That is where building a PHP daemon comes in.

Starting Out

If we’re going to be building a daemon, one of the primary purposes is to build something that can handle multiple tasks at one time. To do that we need to be able to handle multiple connections at once. Building an application that can handle multiple connections at once is actually very simple. So simple, in fact, that we’ve already done it back in the chapter on Networking and Sockets. However, there are two problems, one old and one new. And both deal with how to use the resources on the system as well as possible.

The old problem is the question of how to use CPU time efficiently when you may have significant parts of your work that require other services and, thus, have wait times. These can be things like database calls, network calls or even file system access. Even though disks are relatively fast these days, IO contention does occur. This isn’t just a problem that happens for larger organizations with high performance environments but also for smaller shops who have inefficient data structures on disk, or just high performance data requirements. So, how do you do other things while waiting for slower things to do their thing?
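
The excerpt stops before the book gives its own answer, so what follows is only one common way to approach the question: ask the operating system which streams are ready with stream_select() instead of blocking on each one in turn. This is a minimal sketch that assumes a Unix-like system (for STREAM_PF_UNIX). The two in-process socket pairs, named db and api, are hypothetical stand-ins for slow services.

```php
<?php
// Two in-process socket pairs stand in for two slow services (hypothetical).
$pairs = [
    'db'  => stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP),
    'api' => stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP),
];

// Simulate that only one of the "services" has answered so far.
fwrite($pairs['api'][1], "api result\n");

$pending = ['db' => $pairs['db'][0], 'api' => $pairs['api'][0]];
$results = [];

// stream_select() rewrites $read so that it holds only the readable streams.
$read   = array_values($pending);
$write  = null;
$except = null;
$ready  = stream_select($read, $write, $except, 1); // wait at most 1 second

if ($ready > 0) {
    foreach ($pending as $name => $stream) {
        if (in_array($stream, $read, true)) {
            $results[$name] = trim(fgets($stream));
        }
    }
}

// Anything not in $results is still waiting; the process is free to do other work.
print_r($results);
```

Only the api entry comes back here because the db side never wrote anything. In a real daemon this call would sit inside a loop, with the timeout deciding how often the process gets to do housekeeping between checks.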

 

You want to do WHAT with PHP? Chapter 7

Most PHP developers are used to dealing with files: files that are uploaded, downloaded, etc. If we work with data files it is usually in the form of XML or CSV or something like that. But what if the files that users were uploading and downloading had information in them that you wanted to get? Say that you were hosting MP3 files on your website that people could upload. You might want to get the ID3 information that states who has the copyright. Or people might be uploading Word documents and you want to get the author information. Sometimes there are libraries available to read certain file formats in PHP, but more often than not, there aren't. The purpose of this chapter is to get you started in being able to read and understand binary files. Even if you aren't using them directly in your application, knowing how to read them is a good exercise, since there is a good chance that at some point you will need to work with them. Even if it's only a one-off script to do some basic data transformation, knowing how to access binary files is a good thing and, as I said earlier, a lot of PHP developers don't do this.

In this chapter we go through the basics of accessing structured files. We start with TAR files, move to WAV files, and then write a read-only interface to an EXT2 file system. You'll never do that in a production environment, but by looking at it you might learn a bunch of things. Plus, a lot of PHP developers write their applications without having any understanding of how they will affect storage. The EXT2 file system example will help you with that. Then we wrap up by writing our own binary file format for storing linked lists durably.

   Chapter 1: Networking and Sockets
   Chapter 2: Binary Protocols
   Chapter 3: Character Encoding
   Chapter 4: Streams
   Chapter 5: SPL
   Chapter 6: Asynchronous Operations with Some Encryption Thrown In
   Chapter 7: Structured File Access
   Chapter 8: Daemons
   Chapter 9: Debugging, Profiling, and Good Development
   Chapter 10: Preparing for Success

Structured File Access

In the desktop world, structured files are relatively commonplace. But in the web world we tend not to deal with them very much. The reason for this is that we usually end up dealing with structured data via a database. Often it doesn’t make much sense for a web developer to store data structured according to a proprietary format. A database does for us what we generally need to do. Additionally, we tend to work with string data as opposed to binary data, which is something that structured data files tend to use more.

But there are some times when knowing how to figure out the internal format of a file can be useful. Other times being able to write to those files, or even write your own format could be beneficial. And, like with networking, it is just good to have an understanding of how to do things that aren’t in your regular tool belt.

If you are not familiar with structured files, read this chapter slowly. There is a lot of detail and it is very easy to get lost, so take breaks and try writing out some of the code yourself. Don’t expect to understand it all at once; this chapter actually took me a very long time to write. In fact, it would probably be a good idea to read each section separately and intersperse other chapters between them. This chapter will probably be the most difficult one to get through, so take your time.

Tar Files

To start with, let’s look at some file formats with open standards. Often files will have a file header. This will contain metadata about the file. Depending on the file, this metadata could be a file version number, author, bitrate or any number of other parameters.

The tar format is short for “tape archive”. It was initially used for the purpose of storing backup data on tape drives but, as any developer who touches a Linux system knows, it has expanded well beyond that use.

The tar format is a relatively simple format that allows individual files to be stored in one file for easy transport. The use of gzip has become virtually synonymous with tar, though we will not look at that in great depth simply because gzipping a tar file is just the simple act of taking the raw tar file and compressing it.

Let’s first look at an existing tar file containing the source code for PHP 5.2.11. The tar file headers are actually just simple text strings but they are stored in a structured format. In other words, they are just text strings, but they are fixed length text strings, similar to a CHAR text field in SQL.

Before going into the actual file itself, here is the structure of a file header record.

 

Offset  Size  Description
0       100   File Name
100     8     File Mode (permissions)
108     8     Numeric User ID
116     8     Numeric Group ID
124     12    File size in bytes
136     12    Last modified Unix timestamp
148     8     Header checksum
156     1     Record Type
157     100   Linked file name

Figure 7.1 Tar header record format

The Record Type can be one of 7 different values.

 

Value  Type
0      Regular File
1      Unix link
2      Unix symbolic link
3      Character Device (virtual terminal, modem, COM1)
4      Block Device (disk partition, CD-ROM drive)
5      Directory
6      FIFO or named pipe

Figure 7.2 Tar record types

Reading a single header record is quite easy, as we’ll show in the following code. The tar block size is 512 bytes and so even though we only use about 250 bytes, we read the entire 512 byte block. As you look at more structured files, this block based approach will be a very common occurrence.

$fh = fopen('php-5.2.11.tar', 'rb');
$fields = readHeader($fh);
foreach ($fields as $name => $value) {
    // trim() strips the NUL padding as well as whitespace
    $value = trim($value);
    echo "{$name}: {$value}\n";
}

// Reads one 512-byte tar block and splits it into named header fields.
function readHeader($resource)
{
    $data = fread($resource, 512);
    return strunpack(
        '100name/8mode/8owner/8group/'
        . '12size/12ts/8cs/1type/100link',
        $data
    );
}

// Splits $data into fixed-length strings described by "<length><name>" pairs.
function strunpack($format, $data)
{
    $return = array();
    $fieldLengths = explode('/', $format);
    foreach ($fieldLengths as $lens) {
        // Whatever follows the leading digits is the field name
        $name = preg_replace('/^\d+/', '', $lens);
        $lens = (int)$lens;
        if (ctype_alpha($name)) {
            $return[$name] = substr($data, 0, $lens);
        } else {
            $return[] = substr($data, 0, $lens);
        }
        // Drop the bytes that were just consumed
        $data = substr($data, $lens);
        if (strlen($data) === 0) {
            break;
        }
    }
    return $return;
}

Figure 7.3 Reading the Tar header

Most of this code is simply there to make it easier to read string-based data. With its single-character format codes, unpack() returns an array of individual characters and not full strings, so that approach gets a little cumbersome when dealing with anything beyond simple string operations. That is the purpose of the strunpack() function: it takes a run of fixed-length fields and returns each one as a single string, keyed by name. You might think that you could use something like fscanf(), but %s does not like NULL characters. Since there are many NULL characters in a tar file, this will not work well for us. So most of this code is here to handle reading the file information, but we will use it a fair amount later on.

The output for this code is

name: php-5.2.11/
mode: 0000755
owner: 0026631
group: 0024461
size: 00000000000
ts: 11261402465
cs: 012435
type: 5
link:
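The numeric fields in that output are ASCII strings holding octal numbers, which is how the tar format stores them. A minimal sketch of turning them into native values could look like the following. The decodeHeader() function and the dataBlocks key are hypothetical names made up for this illustration; the input values are copied from the output above.

```php
<?php
// Hypothetical helper: converts trimmed header strings into native values.
function decodeHeader(array $fields)
{
    $size = octdec($fields['size']);
    return array(
        'name'       => $fields['name'],
        'mode'       => octdec($fields['mode']),
        'size'       => $size,
        'modified'   => octdec($fields['ts']),
        // File data is padded out to a whole number of 512-byte blocks,
        // so this is how many blocks sit between this header and the next.
        'dataBlocks' => (int) ceil($size / 512),
    );
}

$decoded = decodeHeader(array(
    'name' => 'php-5.2.11/',
    'mode' => '0000755',
    'size' => '00000000000',
    'ts'   => '11261402465',
));

printf(
    "%s mode=%o size=%d modified=%s\n",
    $decoded['name'],
    $decoded['mode'],
    $decoded['size'],
    gmdate('Y-m-d H:i:s', $decoded['modified'])
);
```

The first record is a directory, so its size is zero and no data blocks follow it; the next header starts in the very next 512-byte block.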

You want to do WHAT with PHP? Chapter 6

One of the things that I think PHP developers do not do well is asynchronous processing.  PHP developers have written reams of applications that do all their calculations up front and over-and-over again for multiple requests.  Or they will just write their code to work linearly, regardless of the scalability implications.  In this chapter I wrote a simple example showing how you can do some asynchronous processing.  It is a basic example that I use and there is a LOT more I could have talked about, and perhaps I should have.  But this example will get you started thinking about how to architect your application so that you can greatly increase the scalability of your application.


 

Asynchronous Operations with Some Encryption Thrown In

One of the primary ways of increasing the ability of your website to scale is asynchronous operations. Modern web applications have a lot of logic to execute for each request and the effect of this is often slow requests. However, web requests are intended to be snappy. This is partially due to the demands of users but also because web sites need to be able to serve up content to large numbers of people on a moment’s notice.

The problem of asynchronous operations is going to have a different solution depending on the type of application that is being written. However, what we’re going to do is look at an example of how you could implement this in your application. I would not suggest that you take this code and throw it into a production environment, but at the same time, the concepts that we demonstrate here are quite pertinent to what you may need in a production environment.

We are going to use an example that uses a combination of different applications. The example application that we’re going to build is a simple credit card processor. This is probably the most common need for asynchronous operations that I have seen. It may not be the most interesting example, but it probably is the most pertinent. Also, because of the whole credit aspect we need to throw encryption in. When you first read the title to this chapter you may have thought “how in the world are these two related?” Well, credit card processing is why. We could have done something like search to show an example, but some other company beat me to it.

Another question you may have is “why is this even a problem? Couldn’t you just handle this in the web server?” The answer is “yes, you could.” But there is a caveat to that. The caveat is “if you are not concerned with scalability or high load.” For the most efficient use of a front end web server you do not want to have it sitting idle while PHP waits for the response to come back from the credit card processor. One of the reasons for this is that if you have a high number of transactions you can actually run out of spare processes to serve other requests. Without some kind of asynchronous processing going on, your front end web servers could be overloaded while being completely idle if someone put in a Luhn-validated, but inactive, credit card number. This would be a very easy denial-of-service attack to implement. But by offloading any long-running functionality to a backend queuing mechanism, the front end can remain snappy while any load is managed in the back end.
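For readers who have not met the term, "Luhn-validated" means the number passes the Luhn checksum, which only shows that the digits are well formed and says nothing about whether the card is active. A minimal sketch of that check follows; passesLuhn() is a hypothetical name for this illustration and is not code from the chapter.

```php
<?php
// Hypothetical helper: true when $number passes the Luhn checksum.
function passesLuhn($number)
{
    // Ignore spaces and dashes; only the digits matter.
    $digits = preg_replace('/\D/', '', $number);
    if ($digits === '') {
        return false;
    }

    $sum = 0;
    $double = false;
    // Walk from the rightmost digit, doubling every second one.
    for ($i = strlen($digits) - 1; $i >= 0; $i--) {
        $digit = (int) $digits[$i];
        if ($double) {
            $digit *= 2;
            if ($digit > 9) {
                $digit -= 9;
            }
        }
        $sum += $digit;
        $double = !$double;
    }

    return $sum % 10 === 0;
}

var_dump(passesLuhn('4111 1111 1111 1111'));
var_dump(passesLuhn('4111 1111 1111 1112'));
```

A number that passes this check still has to go to the processor to find out whether it is real, and that round trip is the slow part that ties up a front end process.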

This brings us to what our architecture will look like. It will be relatively simple. Our front end will be the Apache/PHP instance that you know and love. The backend credit card handler will be running Zend Server’s Job Queue. The Job Queue is a feature in Zend Server 5.0 and above that allows you to run jobs asynchronously from the front end. We could write our own daemon for this example, but we will cover daemons in another chapter. Here we just need something that already works.

The way this works is that a form is submitted from the browser. This form should contain all of the information needed to do a valid credit card transaction. For processing the credit card transaction we are going to be using Authorize.net. Why? Because it came up first in a Google search and doesn’t require you to have an actual merchant account to test its API.

Once the credit card form has been submitted, the web server will do two things. First it will call the Job Queue and pass the credit card information. Then it will render a page that contains some JavaScript that will connect to the message queue. This is the point where our logic separates.

The web browser, once it has the HTML, will make an Ajax request not to the web server, but to the message queue. Feel free to use any message queue that you want, as long as it supports some kind of HTTP request. What we’re going to do, though, is build a very small queue proxy whose only purpose will be to broker requests from the back end job queue to the front end listener. It will not have durability or any of the other features that a typical message queue would have. There are other message queue software packages available, but they tend to be a little difficult to set up for the simple demonstration we’re going to have here. But if you end up using the methods described here I would recommend taking a look at some of the message queues that are already available. Many of them already have HTTP-based messaging available.

Getting the web page to properly display the data from the message queue is actually a bit of a pain due to the browser’s so-called cross-domain security features. This is especially true if you just want a simple example that could possibly be re-used. It’s quite do-able, as you can see, but because of the security restrictions in the browser you may need some patience. The message queue’s HTTP API is the reason for this and it’s not because of HTTP, but because of your browser. It doesn’t matter if you are on the same IP address with a different port or on a sub domain, your browser is going to complain about cross-domain permissions issues.

In this example we are using jQuery 1.3.2 to handle processing the remote data. The way we do that is by taking the output of the job we run, and wrapping it in a mod_rewrite call which encapsulates the output in a JSON object which is then attached to the window. What this basically means is that the output is attached to the browser by actually making it part of the JavaScript code. It’s not the most elegant solution, but it’s simple, it works and it doesn’t have cross-domain issues.
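The general idea of making the output part of the JavaScript code is usually called JSONP: the server wraps its JSON in a function call, and the browser loads the result as a script. The sketch below shows only that wrapping step, under assumptions of my own; wrapAsJsonp(), the handleResult callback and the payload are hypothetical and are not the chapter's mod_rewrite setup.

```php
<?php
// Hypothetical helper: wraps a payload in a JavaScript function call.
function wrapAsJsonp($callback, array $payload)
{
    // Only accept a plain identifier so the callback name cannot inject script.
    if (!preg_match('/^[A-Za-z_$][A-Za-z0-9_$]*$/', $callback)) {
        throw new InvalidArgumentException('Invalid callback name');
    }
    return $callback . '(' . json_encode($payload) . ');';
}

// The browser would load this through a script tag and run handleResult().
echo wrapAsJsonp('handleResult', array('status' => 'approved')), "\n";
```

Because the response is loaded as a script rather than fetched with an Ajax call, the browser's cross-domain restrictions on Ajax requests do not apply to it.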

You want to do WHAT with PHP? Chapter 5

Quick! Raise your hand if you know the most underutilized feature in PHP. If you're thinking type-juggling you're wrong (that's probably the most over-utilized feature). It is, in my mind, SPL. If you are doing any data processing whatsoever you are using arrays. And most likely you are doing database queries, iterating over the results and doing your algorithm-ing. But what if you have additional functionality that you need to have integrated with your data? You could go the traditional route and copy and paste half your application around, or you could build what we like to call structured applications. SPL allows you to do that. How? Well, that's one of the reasons why I wrote the book "You want to do WHAT with PHP?". Here's your excerpt…


… snip

This code prints out

rewind()
valid()
current()
Found: string
next()
valid()
current()

Take a little time to look over the output and compare it to the code that we had written. There are a couple of things that you can glean from this output. First of all, you can see that the rewind() method is the first thing called. That tells you that foreach will always attempt to iterate over all items in this object. However, the fact that you have control over the execution of that code means that you can decide if you want to honor that or not. Generally you should, but if your object is a linked list that should not go back to the beginning, you can enforce that functionality in the object.

So far we’ve only looked at echoing out the method names. So let’s take a look at a simple implementation of what this would look like.

class User
{
    public $name;
}

class UserCollection implements Iterator
{
    private $_users = array();
    private $_position = 0;

    public function __construct()
    {
        $names = array(
            'Bob',
            'David',
            'Steve',
            'Peter'
        );

        foreach ($names as $name) {
            $user = new User();
            $user->name = $name;
            $this->_users[] = $user;
        }
    }

    // Returns the User at the current position
    public function current()
    {
        return $this->_users[$this->_position];
    }

    public function key()
    {
        return $this->_position;
    }

    public function next()
    {
        $this->_position++;
    }

    public function rewind()
    {
        $this->_position = 0;
    }

    // foreach stops as soon as this returns false
    public function valid()
    {
        return isset(
            $this->_users[$this->_position]
        );
    }
}

$col = new UserCollection();
foreach ($col as $num => $user) {
    echo "Found {$num}: {$user->name}\n";
}

Figure 5.4

This code outputs

Found 0: Bob
Found 1: David
Found 2: Steve
Found 3: Peter

No surprises. But there’s a problem here. From a practical standpoint, it is not very beneficial because it doesn’t really do anything different from a regular array. So why would you use it? One reason is that this type of functionality is quite useful when you want to load things on demand. Lazy loading. Laziness is a programmer’s virtue, except when it comes to leaving the coffee pot empty. So what could this look like from a practical standpoint?
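Before bringing a database into it, here is a rough sketch of what loading on demand means for an iterator. Everything in it is an assumption made for illustration: the LazyUserCollection class, its loadCount property and the closure standing in for a query are hypothetical, and the return types assume PHP 8.

```php
<?php
// Hypothetical lazy collection: the loader only runs when iteration needs data.
class LazyUserCollection implements Iterator
{
    public $loadCount = 0;      // how many times the loader has run
    private $_loader;
    private $_users = null;     // null means "not loaded yet"
    private $_position = 0;

    public function __construct(callable $loader)
    {
        $this->_loader = $loader;
    }

    // Runs the loader the first time data is needed, then reuses the result.
    private function _load(): void
    {
        if ($this->_users === null) {
            $this->_users = call_user_func($this->_loader);
            $this->loadCount++;
        }
    }

    public function current(): mixed
    {
        $this->_load();
        return $this->_users[$this->_position];
    }

    public function key(): mixed
    {
        return $this->_position;
    }

    public function next(): void
    {
        $this->_position++;
    }

    public function rewind(): void
    {
        $this->_position = 0;
    }

    public function valid(): bool
    {
        $this->_load();
        return isset($this->_users[$this->_position]);
    }
}

// The closure is a stand-in for an expensive query.
$lazy = new LazyUserCollection(function () {
    return array('Bob', 'David');
});
$loadsBefore = $lazy->loadCount;    // nothing has been loaded yet

foreach ($lazy as $num => $name) {
    echo "Found {$num}: {$name}\n";
}
foreach ($lazy as $num => $name) {
    // A second pass reuses the data that is already loaded.
}
```

Creating the object costs nothing; the expensive work happens on the first call to valid(), and only once no matter how many times the collection is iterated.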

Let’s start with a database table.
