Realtime logging for Magento

4 Comments

There are a couple of things that animate me in PHP/Magento world.  The first is queuing.  I’ve written and spoken about that several times and do as much with it as possible.  The second is something that I’ve not written about nor done a lot of work in.

One of the other things that I have had in the back of my mind for the past several years is a cloud-based, completely integrated development environment.  Personally, I think that the development tooling in the PHP realm are still very young and basically come down to an IDE.  Debuggers are still not used as much as they should be.  var_dump() is our friend, right?

Ever since the Zend Developer Cloud started up it stirred in me some really interesting possibilities of what could be done.  Sadly they never happened, but the ideas remained.  If that doesn’t inspire you it’s because I’m not describing what I have in my head.  I don’t have the time to do that.

But this blog post is one part of it.  One of the things that is part of what I envision is a realtime logger that shows what is happening, as it is happening.  There is good software out there that shows what happened during a request, but it is always after the fact.  The Zend Server ZRay feature is a good example.  It’s a cool feature but doesn’t quite go as far as I have in mind.

With that in mind I spent some time last week working on a very small piece of this vision for Magento.  I wrote it really, really quickly and so don’t laugh when you look at the code and see obvious errors.  One error is the mechanism that I intended to use to shutdown logging when the watcher is not running, but sadly the pcntl code does not seem to do what I expect it to, which is odd because I copied and pasted the code from the documentation.

It is a combination of Magento extension and command line program that utilizes Redis PubSub for messaging passing.  It

  • Hooks in to every Magento event and provides instrumentation for that
  • Contains a log writer so that if an exception is logged or some message is logged it will pass it through to the watcher (no running through var/report again)
  • Contains an SQL profiler that will provide any SQL queries in real time or a summary at the end of the request.

The logger in Magento is controlled by the watcher, which sends logging commands when it is started up.  Following is the help menu.

[[email protected] bin]$ php watcher.php --help
This script is used to connect to the front end of the Eschrade_PubSubLogger utility.
Note that if you are using the File cache backend you may need to clear it from /tmp if the user you are running this script under does not have permission in {magento}/var/cache.
Usage: watcher.php [ options ]
--all-events Listen for all events
--enable-blocks Enable block events (overrides all-events)
--enable-models Enable model events (overrides all-events)
--events <string> Comma-separated list of events to listen for (overides any settings)
--resolve-classes Look up the class based off of group class names
--hide-event-data Hides the event data when events are turned on
--hide-event-observers Hides the event observers when events are turned on
--delay-messages <integer> Delay messages by X milliseconds (default 75ms) - Makes it easier to follow the log
--no-color Turns off color coding of elapsed times
--show-request-id Show the unique ID for each request
--show-sql [ <string> ] Show the SQL in real time. Optionally add a preg_match() filter
--show-sql-summary Show the SQL summary at the end of the request
--help This help page

By default it just captures the URL, the request start time, the full action name and the elapsed time of the request.  Following are explanations for each option

  • all-events – Turns on event monitoring.  It will provide the event name all the observers for that event and a summary of the data provided to the observers.  By default it turns off block and model events since they end up being noise most of the time.  This will only attach to configured events, not all events.  In other words, if an event has a configuration value in a config.xml it will be reported on.  If there is some event that is triggered somewhere it will not show up.
  • enable-blocks – Turns on block events
  • enable-models – Turns on model events
  • events – provide a comma separated list of the specific events you want to watch for.
  • resolve-classes – when examining event observers this flag will have it look up the class definition from the Magento config
  • hide-event-data – Turns off the event data.  For example, if you are listening to controller_action_predispatch you will not see the controller action class that is passed to the observer
  • hide-event-observers – Turns off printing the observers.  When done in conjunction with hide-event-data this will simply print out a list of all the observers as they are called with their elapsed execution time from the last event
  • delay-messages – One of the things I noticed was that a lot of information comes in very quickly.  So I added a 75ms delay to each message.  Set to zero if you want no delay.
  • no-color – If you are running it with event introspection it will color code elapsed time in events.  However those commands look like gibberish if you are piping to a file.  This turns that off
  • show-request-id – The Magento extension automatically adds a unique ID to each request so you can track the source of logs or events.  However, the watcher does not turn this on by default.  This will turn it on.
  • show-sql – Will show the SQL and its elapsed execution time in realtime as it happens.
  • show-sql-summary – Shows a count of SQL operations, SELECT, INSERT, etc. at the end of a request.

Download it at github.


assertTrue( ): Amazon: The Everything Company

Leave a comment

If you’re thinking about putting your company’s Web presence on Amazon’s computers (using EC2), you might want to ask yourself a few questions. Does Amazon already compete with your business in any way? How long before Amazon does compete with you? Do you want to put your online business in the hands of a potential competitor? Do you want Amazon to know more about your business than it already does? No one’s suggesting Amazon is actually going to spy on your business’s bits and bytes (which are already encrypted anyway, right?), but they can learn a lot simply by knowing your capacity needs, your business’s Web traffic patterns, your scale-out and failover strategies. Just by metering your Web usage, they know too much.

via assertTrue( ): Amazon: The Everything Company.


Magento, ESI, Varnish and performance

2 Comments

I have been doing a little playing with Magento over the past couple of days.  I’ve been helping out Ebay/Magento by delivering some of their performance training over the past few months.  I’m by no means the world’s best Magento person at the moment, but I know the architecture pretty well.

One of the things I’ve wanted to do is play more with Varnish.  There’s lots of hype, and a lot of the hype is true.  It really is as fast as they say it is and worth looking at for a full-page caching solution.

But what about when you “can’t” do full page caching?  Enter Magento (or any ecommerce platform).  Most of the time you pages are fully cacheable.  Right up until you click the “Add To Cart” button.  At that point full page caching doesn’t work and so the default behavior for most platforms is to simply not cache output at that point.

But the problem is that you still have 90% of the page (or more) is still cacheable.  So you are doing full execution on a page that has a bare minimum of actual dynamic content.

This is where ESI comes in.  ESI, or Edge Side Includes, allow servers on the edge of a CDN do full page caching but do dynamic callbacks to your website to fill in certain parts of the page.  You lose a lot of the benefit of having a full page cache in that the many order of magnitude performance improvements you gain with full page caching is reduced to how fast you can get the backend dynamic content.

But who says that ESI only needs to be on the edge?  I decided to take a look at how I might implement ESI, with partially dynamic content in Magento, using Varnish as a processor.

My results are in and they look pretty good, though they are preliminary.  As I write this blog post I’m dealing with some wildly erratic response times from Varnish.  Varnish as a full page cache is consistent.  The ESI pages are consistent.  But put them together and I have response times that vary by two orders of magnitude when I do a benchmark that uses concurrent connections.

But if I do the benchmark with sequential HTTP calls you can definitely see an improvement.

First the results without a full page cache.  This is the way you do it by default now once someone adds something to their shopping cart.

Concurrency Level: 1
Time taken for tests: 1.171843 seconds
Complete requests: 10
Failed requests: 0
Write errors: 0
Total transferred: 81030 bytes
HTML transferred: 76210 bytes
Requests per second: 8.53 [#/sec] (mean)
Time per request: 117.184 [ms] (mean)

Now the results when using Varnish with ESI.

Concurrency Level: 1
Time taken for tests: 0.591032 seconds
Complete requests: 10
Failed requests: 0
Write errors: 0
Total transferred: 86500 bytes
HTML transferred: 81000 bytes
Requests per second: 16.92 [#/sec] (mean)
Time per request: 59.103 [ms] (mean)

That is roughly a 50% improvement with no loss of functionality.

In this example I simply removed the call to the Magento sidebar which renders the shopping cart and replaced it with an ESI tag.

<esi:include src=”http://magento.loc/eschrade/esi/sidebar” />

That routes to a controller where I manually spit out the sidebar contents.

Now, you might be thinking “OK, that’s one HTTP call.  What happens if you have 20 ESI calls?  Won’t your server be overloaded?”  Well, perhaps (though not definitely).  But what you need to remember is that performance and scalability are two different things.  Yes, you could overload that ONE server with 20 ESI calls.  But what if you have 20 servers running behind a load balancer?  Yes, you will be using much more by way of server resources.  But if your response time from each of those backend servers is 50 ms you still only have a total response time of 50ms, give or take a few nanoseconds to render the page.  So while CPU time will be greatly increased across your cluster, wall time will be greatly reduced because the processing is being done asynchronously with Varnish aggregating the results.

This, of course, assumes that Varnish does its ESI processing asynchronously, which I will confirm.  (I’d be surprised if it were synchronous)

[UPDATE]

As noted in the first comment before, it does look like Varnish processes ESI synchronously, meaning one at a time.  So these numbers would seem to hold if you only have one ESI call to make, but they drop off significantly as you add ESI calls.  So I will be looking at different options since the whole premise depends on asynchronous ESI.

Crap.

[/UPDATE]

But like I said, these are preliminary results.  I will do a fuller blog post on the subject when I have the inconsistent performance issue worked out.

 


Using Mongo with PHPCloud

3 Comments

With the release that occurred on March 4th we have added Mongo support to the DevCloud.  So connecting to a Mongo instance is extremely easy to do now.  We don’t provide access to a Mongo collection but you do have the tools to connect to various Mongo service providers.  For this example I used MongoHQ though any service running Mongo sound work.

First go to whichever service you are using and create a new collection.  Like this

Then click on the Zend Server link for your container.  Like this

Go to Server Setup / Extensions and enable Mongo.  Like this

Restart Zend Server and deploy code.  Like this

1
2
3
4
5
6
7
8
9
10
11
$db = new Mongo('mongodb://kschroeder:[email protected]:10050/kschroeder');
 
$collection = $db->kschroeder->views;
 
$collection->insert(
array(
'view' => new MongoDate()
)
);
 
echo $collection->count() . ' views on this page';

Now you’re doing Mongo on the DevCloud like a boss.

It took me all of 10 minutes to do this.


Using Jabber/XMPP for event notification with Zend Server and/or PHPCloud

Leave a comment

I have removed this service due to possible abuse.  It was intended to be a test and I think I’ve gotten what I need

 

There was something I’ve wanted to try for a little while.  Messaging is something that’s interested me and with all of the work I’ve been doing on the Zend Developer Cloud I’ve been interested in seeing how to make things communicate a little better.  One of the ideas I’ve wanted to try was to use instant messaging for notification when  slow request or large memory usage events occur.

So to test this out I decided to write a handler that would take events and pass them off to a Jabber server.  That way when the events occur they’ll be passed to your IM client instead of flooding your email client.

It took a little longer than I wanted, mostly due to the fact that the XMPP protocol is new to me, many of the examples don’t work and it required setting up a server that I was not familiar with.  I ended up using the XMPPHP library and the Openfire server to do my dirty work.  I used PHPCloud as dev environment.

So what I’m going to do is open it up to anyone who wants to try it.  I make no guarantees as to its usefulness or stability but I figured that it would be cool to try.

First of all, you will need an XMPP/Jabber IM client.  I used Trillian and it worked fine.

Next, sign up with this form

<FORM REMOVED>

This will provide an email address for you that you will copy and past into your instance of Zend Server.  Log in to your instance of Zend Server, or click on the link if you are in the DevCloud and go to Server Setup -> Monitor. For the mail server, enter in eschrade.com (because that’s where the notification messages will go), set whatever you want for the Sender Email Address and change the Monitor Rules Default Emails setting to the email address provided in the form above.

Now go to the Monitor settings under Rule Management.

Select all of the rules, or just the ones you want notifications for, select Email Emailing Action and click Change.  You may need to restart Zend Server.

After that you need to take your XMPP/Jabber client and connect it to my server.  Your credentials will be your [email protected], with the password you specified.

Now the next time an event occurs on your Zend Server or DevCloud system you should get a message in your IM like this.

Note that this is NOT an officially supported Zend service, I am just trying this out and wanted to offer it to others to see if they wanted to try it out too.

 

If you want to see how to set it up, here is a YouTube video of how it works.


URL rewriting in PHPCloud

1 Comment

Found out today that the typical rewrite rule

1
2
3
4
5
6
7
RewriteEngine On
 
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]
RewriteRule ^.*$ index.php [NC,L]

doesn’t work too well in the DevCloud.  I would get log errors like

[Tue Feb 07 13:06:33 2012] [error] [client 192.168.0.254] Request exceeded the limit of 10 internal redirects due to probable configuration error. Use ‘LimitInternalRecursion’ to increase the limit if necessary. Use ‘LogLevel debug’ to get a backtrace.

Turns out that the fix is really simple.

1
2
3
4
5
6
7
8
9
Options +FollowSymlinks
 
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteBase /
RewriteRule ^.*$ - [NC,L]
RewriteRule ^.*$ index.php [NC,L]

Adding RewriteBase fixes the problem.


Looking for DevCloud stories

Leave a comment

The Zend DevCloud was released at ZendCon as a technology preview.  I do a lot of my development work on the DevCloud now and I have my stories.  But I’m interested in hearing yours.  If you are using DevCloud for serious development work I would like to hear about it.  We would like to put together some examples of some cool work that people are doing on it.  So if you’d like a chance at getting some featured coverage on the DevCloud, please email me at [email protected] (no, I don’t care about spam any more) with some details.

If you do not have an invitation key, feel free to ping me at the same address and I’ll get you set up.


PHP Cloud and education

Leave a comment

One of the problems that anyone who has taught a programming class will know is that the worst part of the class isn’t bored students, inane questions or poor material, but, rather, getting started.  I delivered Zend’s training for 3 years (myself among several others) and I can say, without reservation, that one of the most consistent impediments to the classes were getting things up and running.

Getting students to install stuff on their local desktops was often problematic.  They might have some conflicting software that would need to be removed (with the corresponding pulling of teeth).  There might be restrictions of what they could and could not install.  Or they might not have had the skills to set up a working environment on their local machine.  So we basically had to support any kind of scenario

To get around that issue we started creating VMWare instances.  Lots of them.  And it ain’t easy either.  You have to keep all of the OS instances up to date along with the IDE.  And once you’re done you need to zero out the free space to maximize compression to keep the VM’s from becoming quite large.

In short, managing coursework is not necessarily as straightforward as one might think.

If you’re in this predicament you might actually want to take a look at phpcloud.com. With the DevCloud what you can do is set up a container that has all of the example code, scripts and anything PHP in nature and take a snapshot of it.  That snapshot includes things like database content.  Students can then take that snapshot and create their own container out of it, without having to install new software, manage an environment that they may not be familiar with or do tasks that they don’t have the skills for.  You get the full PHP environment without any of the hassle plus some additional goodies for free that you’d normally have to pay for (like Zend Server).

So, if you’re doing any kind of training or class work and don’t like wasting time helping people get an environment up and running on a machine that you’re not familiar with, check out phpcloud.com.  It’s currently in a technology preview state but it’s pretty robust.  And the cost?  It’s free.  And we expect there to be a free tier as far into the future as we can see.  Check it out.

Here’s a video where I show how you can use snapshots.


High Scalability – Is it time to get rid of the Linux OS model in the cloud?

Leave a comment

High Scalability – High Scalability – Is it time to get rid of the Linux OS model in the cloud?.

I’ve sometimes thought about where we are and where we come from in terms of programming.  Programmers have this habit of falling for new and shiny things.  So I read this article with some interest as they were asking a couple of interesting things at the end.  Namely,

 Do we need a device driver layer? Do we need processes? Do we need virtual memory? Do we need a different security model? Do we need a kernel? Do we need libraries? Do we need installable packages? Do we need buffer management? Do we need file descriptors? 

My answer, in short, would be a big “yes”… for the time being.  Legacy may be a bad word in some circles but it is simply where we came from.  We could not have gotten here without going the way we came.  Do we need a device driver layer, virtual memory, kernels, etc?

Yeah, pretty much.  Retooling is a massively, mindblowingly expensive operation if we were to remove those.  So, in the short term (months-years) and medium term (years to decade), I don’t think you’ll see a lot of that stuff go away.

But that doesn’t mean we shouldn’t look at ways around it.  I read an interesting article a while ago that was talking about how there is not a single current programming language that will be able to run on the processors of 10-20 years because there is no language that can handle the level of concurrency that we will need to deal with.  Hopefully I’ll have retired a rich man by then so I don’t have to learn how to do it.  But at some point, our systems are going to be run differently.

I think that this is a great opportunity for the cloud.  You can actually ramp up new ways of thinking quite easily in the cloud.  I believe this to be true (right now, at least)  because you can easily provision and de-provision systems that are completely separate from your end users.  What someone could do is build an AMI (as an example) that implements this new shiny way of doing things and it would not have to be wide spread.

Actually, I think that one of the biggest revolutions in the software industry has been the API revolution.  What that means is that you can build part of your infrastructure on one technology while leaving the rest on another.  You don’t need to commit to a technology.  Given that we really don’t know the world will look in 20 years, that is probably a good thing.