SimpleCloud Part 1 – Setting the stage


Earlier in December I did a webinar on the Zend PHP Cloud Application Platform.  It's not some new product or anything like that, but rather a view of how our software is going to fit together.  It's not something that will be "released" in the typical software fashion.  Instead, it is the mindset of our product development teams when they look at building new features: cloud-based pricing for Zend Server, AWS/cloud integration in Zend Studio, and, of course, SimpleCloud.

SimpleCloud is an initiative started last year (2009) for the purpose of allowing you to build cloud-portable applications.  In other words, you can build an application on your local machine and have it (mostly) transparently work on any of the three supported cloud platforms.  The example application I built for that webinar used not just "the Cloud" but all of the cloud services available in SimpleCloud, the Zend Server Job Queue (to scale data processing) and, of course, Studio with its AWS integration.

The application was one that took an uploaded image and resized it.  Simple enough, unless you want it to scale.  The example application that I wrote can theoretically scale quite far.  Not because I'm a great programmer, but because I utilized the underlying architecture of people smarter than me.  That's kind of what the cloud is.  Do you have the expertise to ward off a massive, worldwide DDoS?  Apparently Amazon does.  One of the prime rules of being human is not only to know your strengths, but to know your weaknesses.  Humility is very hard for humans, and admitting that someone may be better than you at something is hard to do.

The purpose of this application was to demonstrate how you can build an application a) for scalability, and, supplementally, b) for the cloud.  It's definitely not there to be pretty.  🙂  So what it does is implement several cloud-based features.  You could implement all of these on your own, but doing so (especially if you are a business) would probably cost you more.  Part of the cloud's appeal is that someone else is the specialist.  Could you use RabbitMQ?  Sure.  But then you have to manage it.  Could you have a massively distributed file system?  Sure! But then you have to manage it.

When you boil it all down; when you distill it to its essentials; when you reduce it to its finest ingredients, the cloud is just an on-demand managed service provider.  Nothing more.

So, what does this application do?

  1. Receives an image to be uploaded
  2. Stores this image on a file system
  3. Executes a job on the Zend Server Job Queue to resize the image
  4. Communicates with the browser, letting the end user know which image sizes have been processed
  5. Lets the user browse files with metadata
  6. Lets the user download resized files

Could you do all of that on your own?  Sure.  Could you do it for a couple of thousand users?  Sure.  Could you do it for a couple of thousand users who all decided to upload their images at the same time?  Nope.  Probably not.  The cloud isn't just about scalability, but elastic scalability.  And the chances are pretty high that you are not good at that, unless you are a large company with loads of resources to call upon.

So let's, then, take a look at what this looks like.  Check the "Related" panel for the link to part 2.

SimpleCloud Part 2 – The Job Manager


In the previous installment I talked a little about the cloud, what Zend is doing in the cloud, and what the example application for my ZPCAP webinar did.  One of the primary characteristics of scalability is the ability to process data as resources become available.  To do that I implemented the Zend Server Job Queue with an abstraction layer that I have now written three different versions of.  I think the fourth will be the charm :-).

The Zend Server Job Queue works by making an HTTP call to a server, which then executes a PHP script.  That HTTP request is the "job" that is going to be executed; the Job Queue daemon is simply pretending to be a browser.  While that works pretty well, I prefer a mechanism that is more structured than simply running an arbitrary script.  Having small, defined, structured tasks allows you to spread those jobs over many servers quite easily.

So what I did was write a relatively simple management system that allows me to define those tasks and execute them on pretty much any server that sits behind a load balancer.  And in the cloud, that load balancer can have a thousand machines behind it AND it can be reconfigured without changing your application.  One of the keys of elastic scalability is that you can throw an application "out there" and it will "work".  That is why the Zend Server Job Queue is a good idea in the cloud: it uses a protocol that requires only a single entry point to be defined, and the rest is left for the infrastructure to work out.  (I personally am of the opinion that PHP developers are too dependent on config files.)

There are two parts to this manager: 1) the queueing mechanism and 2) the executing mechanism.  Both are handled in the same class, named com\zend\jobqueue\Manager.  When a job is "executed" on the front end, it is not actually run; instead, a request is sent to the load balancer using a REST-like API.  The Job Queue, by default, manages its queue on the local host, and I wanted each job server to manage its own queue.  The REST-like call, which contains the serialized object of the job that needs to be executed along with any dependent data (or references to data), goes to the load balancer, which sends it to a host.  That host queues the job on itself and returns a serialized PHP object that provides the host name and the job number.  This result object can then be attached to a session so you can directly query that job queue server on subsequent requests.

The code for the manager is as follows.
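The original listing isn't reproduced here, but based on the description above, a minimal sketch of the manager might look like the following.  The method names come from the prose; the configuration handling and the plain-PHP HTTP plumbing (standing in for Zend_Http_Client) are assumptions.

```php
<?php
// Sketch of the job manager described above.  In the original this class
// lives in the com\zend\jobqueue namespace; the HTTP plumbing here is an
// assumption (plain PHP streams standing in for Zend_Http_Client).
class Manager
{
    private $config; // e.g. array('queueurl' => ..., 'executeurl' => ...)

    public function __construct(array $config)
    {
        $this->config = $config;
    }

    // Front end: serialize the job and POST it to the load balancer.
    // Whichever backend host answers will queue the job locally.
    public function sendJobQueueRequest($job)
    {
        $context = stream_context_create(array('http' => array(
            'method'  => 'POST',
            'header'  => 'Content-Type: application/x-www-form-urlencoded',
            'content' => http_build_query(array('job' => serialize($job))),
        )));
        $body = file_get_contents($this->config['queueurl'], false, $context);
        return unserialize($body); // the response: host name + job number
    }

    // Back end: queue the job on the local Job Queue daemon.  Requires
    // the Zend Server ZendJobQueue extension on this host.
    public function createJob($serializedJob)
    {
        $queue = new ZendJobQueue();
        $id    = $queue->createHttpJob(
            $this->config['executeurl'],
            array('job' => $serializedJob)
        );
        // A stand-in for the response object discussed below.
        return array('host' => php_uname('n'), 'jobNumber' => $id);
    }

    // Back end: invoked via the execute URL when the daemon runs the job.
    public function executeJob($serializedJob)
    {
        $job = unserialize($serializedJob);
        try {
            $job->run();
            echo serialize($job);   // ship the finished job back
        } catch (Exception $e) {
            echo serialize($e);     // ...or the exception instead
        }
    }
}
```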


Sequence of Events

sendJobQueueRequest() is the first to be called.  The job is passed in as a parameter and is then serialized.  A connection is made to the URL, which is stored in a Zend_Config object.  That URL can be a local host name or the load balancer's host name.  Using this you can also set up different pools of servers quite easily, simply by creating multiple load balancers and having each pool managed according to its individual resource needs.

sendJobQueueRequest() called on the front end will cause createJob() to be called on the back end.  This queues the job locally by specifying a LOCAL URL that will be responsible for executing the job and creates a response object which contains the unique hostname of the machine and the unique job number on that machine.  It is serialized and echoed.  sendJobQueueRequest() then reads the response and unserializes it into a Response object which can be attached to a session.

This is the code on the backend URL that will be executed to queue the job.
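A sketch of what that queue URL could contain, assuming the Manager class from the earlier listing and a bootstrap that exposes the job-queue configuration (the 'job' parameter name and variable names are assumptions):

```php
<?php
// Sketch of the backend queue URL (e.g. /queue.php).  It receives the
// serialized job POSTed by sendJobQueueRequest(), queues it locally via
// createJob(), and echoes the serialized response object back.
require_once 'bootstrap.php'; // config + SimpleCloud adapters

$manager  = new Manager($jobQueueConfig); // $jobQueueConfig from bootstrap
$response = $manager->createJob($_POST['job']);
echo serialize($response);
```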


Don’t worry about the bootstrap.php yet.  It simply contains some configuration mechanisms and instantiates the SimpleCloud adapters.  We’ll cover that later.

This is the code for the response object (created in createJob()). The front end machine can call getCompletedJob() and pass the response object to check and see if the job is done.
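A minimal version might be nothing more than a value object; the property and accessor names here are assumptions, since all the prose requires is the backend host name and the job number:

```php
<?php
// A minimal sketch of the response object created in createJob().  It
// only needs to carry enough information to find the job again later.
class Response
{
    private $host;
    private $jobNumber;

    public function __construct($host, $jobNumber)
    {
        $this->host      = $host;
        $this->jobNumber = $jobNumber;
    }

    public function getHost()      { return $this->host; }
    public function getJobNumber() { return $this->jobNumber; }
}
```

Because it survives serialize()/unserialize(), it can ride along in the session and be used to poll the right backend host on subsequent requests.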


At some point in the future, as resources become available, the URL noted by Zend_Registry::get(self::CONFIG_NAME)->executeurl in createJob() will be executed.  The code of that URL is
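That execute URL can be as small as this sketch.  ZendJobQueue::getCurrentJobParams() is the Zend Server call for reading the variables attached when the job was created; the rest of the wiring is an assumption:

```php
<?php
// Sketch of the execute URL.  The Job Queue daemon requests this script
// when resources become available; it just hands the serialized job to
// the manager, which runs it and echoes the result.
require_once 'bootstrap.php';

$params  = ZendJobQueue::getCurrentJobParams();
$manager = new Manager($jobQueueConfig); // $jobQueueConfig from bootstrap
$manager->executeJob($params['job']);
```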


Pretty simple, eh?  That’s because most of the magic happens in the Manager class.  This is when executeJob() is called.  It takes that serialized object, unserializes it, and executes the run() method.  We will look at the difference between execute() and run() in a subsequent post.  If the job executes fine, the job is re-serialized and echoed.  If there is an exception thrown, THAT is serialized.

That’s the manager.  Next we will look at the abstract job class and after that we will get into the SimpleCloud components.

SimpleCloud Part 3 – The Abstract Job


We have so far looked at setting the stage and managing the job.  How about executing the job itself?  The job we will look at here will be relatively generic.  I will get into more detail after I have talked about the SimpleCloud elements.  This, here, is simply to show you the theory behind how jobs are executed.

The abstract class is pretty simple.
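Since the listing isn't shown here, a sketch along the lines the post describes (the three method names are from the prose; passing the manager into execute() is an assumption):

```php
<?php
// Sketch of the abstract job.  execute() is called on the front end and
// only queues; run() is called on the backend by Manager::executeJob();
// _execute() holds the actual logic and must be overridden.
abstract class JobAbstract
{
    // The work this job performs on the remote server.
    abstract protected function _execute();

    // Front end: don't run anything; hand ourselves to the queue manager
    // and return its response object.
    public function execute($manager)
    {
        return $manager->sendJobQueueRequest($this);
    }

    // Back end: actually do the work.
    public function run()
    {
        $this->_execute();
    }
}
```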


There are only three methods.  The first is _execute().  This method needs to be overridden.  It is the code that will be executed on the remote server.  And because it will be serialized and executed on the remote host, the code for your job class will need to be deployed there.  You could actually send the source code for the class along with the serialized version and make the backend COMPLETELY stupid, but I would think that anyone remotely security minded could see the problem with that.

To implement a new job, do something like this:
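A hypothetical subclass: the class name matches the org\eschrade\jobs\SendEmail mentioned later in this post, while the constructor arguments and the mail() call are stand-ins for the real work.  The abstract base is repeated in brief so the sketch stands alone:

```php
<?php
namespace org\eschrade\jobs;

// The abstract base from above, repeated in brief: execute() queues on
// the front end, run() fires on the backend.
abstract class JobAbstract
{
    abstract protected function _execute();
    public function execute($manager) { return $manager->sendJobQueueRequest($this); }
    public function run() { $this->_execute(); }
}

// A hypothetical job: only _execute() needs to be written.  Everything
// set on the object before queueing is serialized along with it.
class SendEmail extends JobAbstract
{
    private $to;
    private $subject;
    private $body;

    public function __construct($to, $subject, $body)
    {
        $this->to      = $to;
        $this->subject = $subject;
        $this->body    = $body;
    }

    // Runs on the backend once the daemon picks the job up.
    protected function _execute()
    {
        mail($this->to, $this->subject, $this->body);
    }
}
```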


Then to send the job to the queue call:
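Something like this on the front end ($manager wiring is assumed; in this series it comes out of the bootstrap):

```php
<?php
// Queueing the job.  execute() does not run anything here -- it hands
// the serialized job to the manager and returns the response object,
// which we stash in the session for polling later.
$job = new org\eschrade\jobs\SendEmail('user@example.com', 'Hi', 'Hello!');
$response = $job->execute($manager);
$_SESSION['pendingJob'] = serialize($response);
```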


The execute() method is called on the front end.  But it doesn’t really execute.  It calls the queue manager and queues it on the backend servers.

Then on the backend servers (remember the executeJob() method?) the run() method is called, which actually calls the _execute() method containing the logic.  And while I didn't show it here, because the job is re-serialized after execution you can attach status information or any other data to the object; once it is unserialized on the front end after calling getCompletedJob() on the job manager, that data is available again.  If the job is completed, it will return the unserialized instance of, in this case, org\eschrade\jobs\SendEmail as it existed at the end of its run.

Now, to get to the SimpleCloud portion of this series: storage.  The link for part 4, discussing storage, is in the related stuff section.

SimpleCloud Part 4 – Storage


Now that we’ve gotten some job processing code done, let’s get into the good stuff.  The first thing we’re going to look at is the storage mechanism in SimpleCloud.  The example we used was uploading an image to the server so it could be resized for viewing at multiple resolutions.  Now, you could simply attach the file contents to the job class, serialize it, and unserialize it on the other side.  But the Job Queue server is really not designed for that (nor are most other queueing applications).  So what we’re going to do is use the Storage mechanism in SimpleCloud (in this case, S3) to store the uploaded files temporarily and then to hold the resized versions.

The first thing we need to do is create the adapter.  I am simply putting it into a Zend_Registry object for later retrieval.  It, along with the Document and Queue adapters, is created in the bootstrap file.  The bootstrap file loads the autoloader, creates the config objects and then creates all of the cloud adapters.
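The storage portion of that bootstrap might look like this sketch.  The factory and option constants are Zend Framework 1's Zend_Cloud API; the registry key, credentials, and bucket name are placeholder assumptions:

```php
<?php
// Sketch of the storage part of bootstrap.php.  The getAdapter() call is
// the important bit: it turns a bag of options into a concrete adapter,
// here the S3 one.
require_once 'Zend/Loader/Autoloader.php';
Zend_Loader_Autoloader::getInstance();

$adapter = Zend_Cloud_StorageService_Factory::getAdapter(array(
    Zend_Cloud_StorageService_Factory::STORAGE_ADAPTER_KEY
        => 'Zend_Cloud_StorageService_Adapter_S3',
    Zend_Cloud_StorageService_Adapter_S3::AWS_ACCESS_KEY => 'your-key',
    Zend_Cloud_StorageService_Adapter_S3::AWS_SECRET_KEY => 'your-secret',
    Zend_Cloud_StorageService_Adapter_S3::BUCKET_NAME    => 'my-app-bucket',
));

// Stash it for later retrieval anywhere in the application.
Zend_Registry::set('storage', $adapter);
```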


The most important line is the getAdapter() line.  That takes the configuration options and creates an adapter based on those options.  It’s really quite simple.  In this case I’m using the S3 adapter.

A bucket name needs to be specified, and I believe it needs to be created ahead of time.  This allows you to separate your applications but still use the same account keys.  Easy, huh?  You haven’t even tried using it yet!  Here is the job (distilled to the essentials; full version will be downloadable) that is used to process the images.
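A distilled sketch of such a job: fetchItem()/storeItem() are the Zend_Cloud storage calls, while the GD resizing, the size list, and the path scheme are stand-ins for whatever the full version does.

```php
<?php
// Sketch of the image-resize job.  setSourceFile() runs on the front end
// and pushes the upload into S3; _execute() runs on a backend server,
// pulls the file back out, and stores the resized copies.
class ResizeImages extends JobAbstract
{
    private $sourcePath;
    private $widths = array(800, 320, 100); // assumed target sizes

    // Front end: upload the source file to cloud storage and remember
    // where it went, so only the path travels with the serialized job.
    public function setSourceFile($localPath)
    {
        $this->sourcePath = 'uploads/' . basename($localPath);
        Zend_Registry::get('storage')->storeItem(
            $this->sourcePath, file_get_contents($localPath));
    }

    // Back end: fetch, resize with GD, store each size back into S3.
    protected function _execute()
    {
        $storage = Zend_Registry::get('storage');
        $src = imagecreatefromstring($storage->fetchItem($this->sourcePath));

        foreach ($this->widths as $width) {
            $height = (int) (imagesy($src) * $width / imagesx($src));
            $dst = imagecreatetruecolor($width, $height);
            imagecopyresampled($dst, $src, 0, 0, 0, 0,
                $width, $height, imagesx($src), imagesy($src));

            ob_start();
            imagejpeg($dst);
            // push the resized copy back to S3 under a size suffix
            $storage->storeItem($this->sourcePath . '-' . $width,
                                ob_get_clean());
        }
    }
}
```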


The calls to the storage adapter are the important parts.  The point here is that the storage and retrieval of file data is pretty much transparent: store, fetch.  Integrating between the front and back end is pretty easy, too.
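The front-end side, sketched; the form field name, the redirect target, and the registry keys are assumptions, while setSourceFile() and the session handling follow the description below:

```php
<?php
// Sketch of the front-end upload action.  setSourceFile() uploads the
// file to S3, execute() queues the resize job and returns the response
// object (backend host + job number), which goes into the session.
require_once 'bootstrap.php';

$job = new ResizeImages();
$job->setSourceFile($_FILES['image']['tmp_name']);

$response = $job->execute(Zend_Registry::get('jobManager'));
$_SESSION['resizeJob'] = serialize($response);

// Forward to the page that polls for completed sizes.
header('Location: /status.php');
```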


So, what is going on here?  When we call setSourceFile(), this calls the code that uploads the file to S3.  Additionally, IIRC, there is also a stream API where you can pass a file resource and it uses that instead of the simple file contents.  That’s very useful for storing large files.  But remember in the earlier post where I said that calling execute() doesn’t actually execute the job, but queues it, and that the result is a response object that provides the job number and the server host name?  There you see it getting attached to the session.  This code then forwards to another page, which we will look at in a bit.

But, as you can see, using SimpleCloud to upload files to a storage service is stupid easy when using Zend Framework.

Not another cloud article

Where is the cloud?

Are you hype averse?  Start reading at paragraph number 4.  This is paragraph number 0.

How does one write an article on "the Cloud" without sounding all buzz-wordy?  Can it really be true that a technology has lived up to its hype?  Well, no, of course not.  But I've been wrapping my head around cloud computing for a little while now, trying to understand what the actual benefits are.  I found some good numbers on the cost savings of cloud computing, and they seem to stack up.  Even if they're off by a factor of two, the cost savings are still very good, and the ability to ramp up production machines in a few minutes is a real benefit.  I know of several customers who have massive peaks at predictable times, and maintaining a full set of hardware to handle those couple of hours (maybe days) a year would be silly.  Even customers who have known spikes at 7:00pm that last for 78 minutes may not need to have as many machines running throughout the day.

Cloud computing can also help to enforce good standards.  Josh Holmes showed me the Azure interface last week at TEK-X and, I may be wrong about this, but it looks like it forces you to have a staging environment.  That's just plain awesome. The whole package deployment mechanism is also pretty slick (another place where there is not a strong story for PHP).

This summer I am going to be spending a fair amount of time researching the cloud and trying to find ways to make the cloud practical for you.  SimpleCloud is a good start.  But I'm not quite there yet, because I don't think we have yet touched on where the cloud benefits the PHP developer.

In my pondering I have tried to think of what the benefits are to the PHP developers on the ground floor.  Most of the reasons I've found when Googling "Benefits of Cloud Computing" are all the same: lower maintenance costs, automation, etc., blah, blah, blah.  Most of the benefits come from operational costs being lowered.

However, I think that for the PHP developer there is another, greater, benefit that has not been documented.  Or perhaps it has and I've just missed it. I think that the benefit to the PHP developer is a forced change of mindset.

A PHP developer thinks in terms of request/response, request/response, request/response. Is there an application that is truly request/response? Well, yes, of course there is. In fact, many of them are. However, I have found that the application architecture that many PHP developers subscribe to is often a weak link in the chain when it comes to scaling logic. That weak link is the strong relationship between PHP applications and relational databases while packing as much logic into a request as possible. And the solution is often "Let's cache!!!"

Don't get me wrong.  I am by no means dissing the PHP-SQL relationship.  And I also don't think that there is anything wrong with relational databases.  However, PHP started out as a simple interface to an RDBMS, and that mindset is still very strong among PHP developers, except that the queries are more complex (or are they simpler now?  I dunno.  Thanks, Active Record :-)).  And perhaps it's time to start re-examining how we build our applications as we move more and more into the critical application space.

But what does this have to do with cloud computing?  When you look at the nature of the cloud, it builds upon two concepts, one of which has been in use in the more "enterprise-y" languages for a while, and another which has not.  The one that has not is the concept of a document storage engine instead of an RDBMS.  Most of the cloud services I have looked at build their foundational data access layer on some kind of loosely coupled or completely decoupled storage mechanism.  Many organizations bristle at the thought of having de-normalized data, and understandably so.  But at the same time, how many applications truly require fully normalized data?  From the PHP perspective, that number is smaller than what is often believed, in my opinion.  To PHP developers' credit, there has been interest in storage engines like MongoDB, which are also part of the picture, but people are still kicking the tires, so to speak (as they should be).

The other part that PHP developers don't often work with is that of messaging and queuing. I have talked with several Microsoft, Adobe and IBM developers and the conversation goes like this.

Me: What is it that you think PHP could do better?
Them: Static typing, threading (and other assorted bull)
Me: What about messaging/queuing?
Them: YES!!! Absolutely!

You may have noticed that I have several articles about the Zend Server Job Queue on this blog.  There's a reason for that.  Too many developers think along the lines of "What can I get done in this individual request/response?"  Integral to most, if not all, cloud computing environments is the integration of messaging and queuing.  This is also key to building a large-scale application.  Not necessarily a "high-performance" application, but an application with loads of business logic, loads of dependencies and loads of data.  When people try to build complex applications in PHP, they often run into serious performance issues and end up calling a services organization so someone can "tune" their PHP stack to make it faster (good luck with that) or find out why their application does not have acceptable response times.  Messaging and queuing are key to making this problem go away, and both are features that you will find in cloud stacks.

Could all of this be done outside of the cloud?  Sure, but my experimentation this afternoon cost Zend 25 cents.  It's hard to argue against that.  And since virtually all environments have cyclical traffic, having all of your servers turned on all of the time does seem like a waste.  I remember working for a company where we could have cycled through 3-10 servers over the course of a day based on load, with the option to turn on a few more if a spike occurred.  It is nearly impossible to accurately predict load; you can only "mostly guess".

That's it for now.  I will be taking the next while to examine different aspects of cloud computing and looking at the various vendors so you can start coding.

Also, if you have any comments or thoughts, feel free to post them in the comments section or contact me on Twitter or something.