Testing GlusterFS for Magento


I am not a fan of NFS for production information.  NFS is great for aggregating data from across multiple different machines, storing deployment files and other such administrative things.  Serving static content?  No.  I haven’t blogged about it but I have talked about it several times at conferences.  NFS, as a static content distribution mechanism, is horribly slow.  Here’s the chart to prove it.

[Chart: nfs-bad]

Never mind the issues that you need to take into consideration.

I know a lot of Magento deployments use NFS so they can have static content accessible from multiple different servers and have it all “up to date”.  But, like I said, I am not a fan of using NFS for production and so while it works, in my opinion, it is not optimal.

So I wanted to test GlusterFS to see if it provided a more optimal solution.  I wrote a quick test script to try it out.  I tried creating multiple files, reading from one file multiple times, reading from the sequence of files multiple times and then deleting the files.  The Gluster configuration is vanilla, as is NFS.  Here’s the code I used to test.

$fs = array('/kschroeder/gfstest', '/mnt/gfstest', '/var/tmp/gfstest');
 
foreach ($fs as $f) {
  @mkdir($f, 0755, true);
  $startTime = microtime(true);
  for ($i = 0; $i < 500; $i++) {
    // 10 KB payload; the str_pad() argument order is (input, length, pad string)
    file_put_contents($f . '/test-' . $i, str_pad('', 10240, '0'));
  }
  $elapsed = microtime(true) - $startTime;
  echo "$f write $elapsed\n";
 
  $startTime = microtime(true);
  $files = glob("$f/*");
  $file = array_shift($files);
  for ($i = 0; $i < 10000; $i++) {
    file_get_contents($file);
  }
  $elapsed = microtime(true) - $startTime; 
  echo "$f read-one $elapsed\n"; 
 
  $startTime = microtime(true);
  foreach (glob("$f/*") as $file) {
    file_get_contents($file);
  }
  $elapsed = microtime(true) - $startTime; 
  echo "$f read-all $elapsed\n";
 
  $startTime = microtime(true);
  foreach (glob("$f/*") as $file) {
    unlink($file);
  }
  $elapsed = microtime(true) - $startTime;
  echo "$f delete $elapsed\n";
 
}

/kschroeder/gfstest was an NFS mount, /mnt/gfstest was a GlusterFS mount and /var/tmp/gfstest was local.

Here are the results.

[Chart: not good enough]

This is the elapsed time to conduct the individual tests, so lower is better.  GlusterFS did better than NFS on all the different operations.  However, neither came anywhere close to local speeds.

GlusterFS is definitely better, IMHO, than NFS.  It does a lot of things like clustering, striping and replication out of the box and is very easy to configure all of those options without having to worry about magic or other components to make it work.  For that reason I would definitely put it on my list of preferred solutions for handling static web content.

But what I want is a distributed, replicated static file storage mechanism that does client-side caching so my static file read performance is at least almost as good as local.  And if my main file server goes down I don’t want my entire web site to go down.  And it needs to be easy to install and manage so that merchants, who often do not have a lot of system administration experience, are able to do basic SA tasks.  As much of an improvement as GlusterFS is, out of the box it doesn’t seem to solve the problem fully.

The search continues (and I do have some other possibilities I want to test).

 

What I would love in PHP-FPM

I love most things about PHP, but what I don’t like is that in order for me to do any kind of asynchronous processing I need to create an infrastructure.  In other words, I need to build a queuing daemon or build some kind of interface.

It really shouldn’t be that much work for what is a simple task in many other languages.

So it would be really cool if PHP-FPM had a FIFO/delayed queue where you could inject a FastCGI request into the queue and do either fire and forget or allow the executing process to wait on a queue selector.  So it would look kind of like this

$j1 = new FpmRequest('/some/url', 'POST', array('var' => 1, 'var2' => 2));
 
$q = new FpmQueue();
$q->addJob($j1);
$q->execute();

or if you want to wait for the response, this

$j1 = new FpmRequest('/some/url', 'POST', array('var' => 1, 'var2' => 2));
$j2 = new FpmRequest('/some2/url2', 'POST', array('var' => 1, 'var2' => 2));
$j3 = new FpmRequest('/some3/url3', 'POST', array('var' => 1, 'var2' => 2));
$q = new FpmQueue();
$q->addJob($j1);
$q->addJob($j2);
$q->addJob($j3);
 
$q->execute();
$q->wait();
 
echo $j1->getOutput();
echo $j2->getOutput();
echo $j3->getOutput();

It would be nice for the Apache SAPI to do this as well, so I could debug the requests more easily (I use Zend Server which, ATM, only supports Apache).  But it would seem that PHP-FPM would have an easier time of doing this: because it manages its own resources, it could do things like maintain a separate pool reserved for queued requests.

Maybe I have unique use cases or I just like making things more complicated for myself.  But it would be really nice to have some kind of queuing work out of the box.
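In the meantime, PHP-FPM does offer one partial stand-in for the fire-and-forget case: fastcgi_finish_request(), which flushes the response to the client and lets the script carry on working.  The sketch below is not a queue and does nothing for the "wait on several jobs" case.  runAfterResponse() is a hypothetical helper name of my own, not part of PHP or PHP-FPM.

```php
<?php
/**
 * Hypothetical helper: finish the client response (when running under
 * PHP-FPM) and then run a job in the same worker process.
 */
function runAfterResponse(callable $job)
{
    // fastcgi_finish_request() only exists under PHP-FPM, so guard the
    // call; under other SAPIs the job simply runs inline.
    if (function_exists('fastcgi_finish_request')) {
        fastcgi_finish_request();
    }

    return $job();
}

$result = runAfterResponse(function () {
    // Stand-in for slow work such as sending mail or warming a cache.
    return array_sum(range(1, 100));
});
```

Note that the FPM worker stays busy until the job returns, so this only moves the work out of the user's wait time; it does not free up the process the way a real queue would.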

EAV Properties for Magento

The only thing I hate more than bad code completion is no code completion.  When working with PHP arrays for configuration there are often options that you need to remember to properly configure the object, factory or whatever, that you are using.

What I really like to do for my own code is that when I do have some kind of array-based configuration I like to have class constants that define what options there are for me to set.  Because they are constants they cut down on fat-finger problems and also take out a lot of the guesswork.

Magento is pretty good about doing this, but one place where I have not found much by way of constants is in the EAV system.  There are a number of EAV properties that you can set.  Here is a table of the ones I found.

Config Key                    Merged Key                        Default Value
backend                       backend_model
type                          backend_type                      varchar
table                         backend_table
frontend                      frontend_model
input                         frontend_input                    text
label                         frontend_label
frontend_class                frontend_class
source                        source_model
required                      is_required                       1
user_defined                  is_user_defined                   0
default                       default_value
unique                        is_unique
note                          note
global                        is_global                         1 (Mage_Catalog_Model_Resource_Eav_Attribute::SCOPE_GLOBAL)
input_renderer                frontend_input_renderer
visible                       is_visible                        1
searchable                    is_searchable                     0
filterable                    is_filterable                     0
comparable                    is_comparable                     0
visible_on_front              is_visible_on_front               0
wysiwyg_enabled               is_wysiwyg_enabled                0
is_html_allowed_on_front      is_html_allowed_on_front          0
visible_in_advanced_search    is_visible_in_advanced_search     0
filterable_in_search          is_filterable_in_search           0
used_in_product_listing       used_in_product_listing           0
used_for_sort_by              used_for_sort_by                  0
apply_to                      apply_to                          0
position                      position                          0
is_configurable               is_configurable                   1
used_for_promo_rules          is_used_for_promo_rules           0

Kind of a lot.  And as you get more experience you will memorize them.  But what a pain in the butt.

So I created a new GitHub repository that allows you to simply use some class constants instead of hand-typed array keys for EAV configuration.  Most of the code I have seen for EAV configuration uses inputted key values instead of constants and my Google searches have not yielded much.  So my presumption, possibly incorrect, is that this kind of helper class does not exist.

Using the class is very, very simple.  You only need to put the code from GitHub in code/local/Eschrade/Util.  The autoloader should take care of the rest.  No need to manage a config file.

$installer = Mage::getResourceModel('catalog/setup','default_setup');
 
$config = array(
	Eschrade_Util_Statics::EAV_TYPE_ALIAS 	=> Eschrade_Util_Statics::EAV_TYPE_TEXT,
	Eschrade_Util_Statics::EAV_INPUT_ALIAS	=> Eschrade_Util_Statics::EAV_INPUT_BOOLEAN,
	Eschrade_Util_Statics::EAV_LABEL_ALIAS 	=> 'Some Attribute',
);
 
$installer->addAttribute(
	Mage_Catalog_Model_Product::ENTITY,
	'some_attribute',
	$config
);
 
$installer->updateAttribute(
	Mage_Catalog_Model_Product::ENTITY,
	'some_attribute',
	Eschrade_Util_Statics::EAV_LABEL,
	'Some Other Attribute'
);

Note that for the addAttribute() call I use the name  + _ALIAS.  So ‘label’ for adding an attribute is Eschrade_Util_Statics::EAV_LABEL_ALIAS but for updating an attribute it is Eschrade_Util_Statics::EAV_LABEL.
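To make the alias/non-alias split concrete, here is a minimal sketch of how such constants could be laid out, using three rows from the table above.  Example_Eav_Keys and toMergedKey() are hypothetical names invented for this illustration; they are not the contents of the actual Eschrade_Util_Statics class.

```php
<?php
/**
 * Hypothetical constants class, for illustration only.
 * The *_ALIAS constants hold the config keys used with addAttribute();
 * the plain constants hold the merged keys used with updateAttribute().
 */
class Example_Eav_Keys
{
    const LABEL_ALIAS = 'label';
    const INPUT_ALIAS = 'input';
    const TYPE_ALIAS  = 'type';

    const LABEL = 'frontend_label';
    const INPUT = 'frontend_input';
    const TYPE  = 'backend_type';

    /**
     * Map a config key to its merged key.  Keys that are not in the
     * map (for example 'note') are returned unchanged.
     */
    public static function toMergedKey($configKey)
    {
        $map = array(
            self::LABEL_ALIAS => self::LABEL,
            self::INPUT_ALIAS => self::INPUT,
            self::TYPE_ALIAS  => self::TYPE,
        );

        return isset($map[$configKey]) ? $map[$configKey] : $configKey;
    }
}

$merged = Example_Eav_Keys::toMergedKey(Example_Eav_Keys::LABEL_ALIAS);
```

Here $merged ends up as 'frontend_label', which is the key you would hand to updateAttribute().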

Also note that this class is not complete.  I will make additions to it as I have time and I hope that others who see holes make pull requests to add more things.

Please, please fork it, make additions and fix errors you may see.  Personally, I would rather be able to choose from a list of options than have to manually type things in.  It makes things much cleaner and more predictable.  So I hope that this will make things a little easier for you.

The First Annual Report on Programmer Ass-hattery

The 2013 Kevin Schroeder Report on Programmer Ass-hattery

Taking a cue from TIOBE, where you can take Google search results and make them mean anything that you want them to, I decided that I was going to try an experiment and see if I could discern, from Google search results, how likely a programmer for a given language would engage in ass-hattery.

In short, this is the most important programming index you will ever see.  And unlike many other indexes I will put my hastily concocted formula up front and center for your enjoyment and ridicule.

The formula is

(Google results for “X rocks” / Google results for “X sucks”) * (language score sum / X language score)

I got Google results for PHP, Ruby On Rails, Java, Python, C, C++, Objective-C, C# and JavaScript.  I took the language score from Lang Pop.  The language popularity values for each of the programming languages are

Language         Score
PHP              5591
Ruby On Rails    667
Java             6930
Python           4240
C                9948
C++              5584
Objective-C      352
C#               429
JavaScript       1572

The sum of all of these is 35313.

To illustrate the formula, the Python Programmer Ass-hattery score is 20.6387.  “Python rocks” results are 11,300 and “Python sucks” results are 4,560.  So the formula would look like this: (11300 / 4560) * (35313 / 4240).  Generally, the higher the score, the more likely you will want them to wear their ass as a hat.
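For anyone who wants to check the math, the formula fits in a few lines of PHP.  This is only a sketch: assHatteryScore() is a made-up function name, and the numbers are the ones quoted in this post.

```php
<?php
/**
 * (rocks results / sucks results) * (sum of all scores / this language's score)
 */
function assHatteryScore($rocks, $sucks, $languageScore, $scoreSum)
{
    return ($rocks / $sucks) * ($scoreSum / $languageScore);
}

// Language popularity scores as listed above.
$scores = array(
    'PHP'           => 5591,
    'Ruby On Rails' => 667,
    'Java'          => 6930,
    'Python'        => 4240,
    'C'             => 9948,
    'C++'           => 5584,
    'Objective-C'   => 352,
    'C#'            => 429,
    'JavaScript'    => 1572,
);

$sum    = array_sum($scores);
$python = assHatteryScore(11300, 4560, $scores['Python'], $sum);

printf("%.4f\n", $python); // prints 20.6387
```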

The theory is that the louder one talks compared to their actual popularity the more likely they will engage in ass-hattery.

The results for the First Annual Report on Programmer Ass-hattery are:

[Chart: The 2013 Kevin Schroeder Report on Programmer Ass-hattery]

It’s math so it can’t be wrong.

Comments proving ass-hattery will either be deleted or ridiculed.  This was meant as fun (unlike some other indexes).

For the last time, the file system is not slow!!


Having started working at Magento I have been making myself more familiar with many of the different parts of the community.  I have spent a fair amount of time over the past several weeks trying to understand how people work with Magento and what their problems are.

One of the things that often comes up is speed.  And there are lists of things that people can do to try and make Magento faster.  But there’s something that bugs me in many of these lists.  Often people say that in order to make your Magento installation run faster you need to put certain things on tmpfs or RAM.

Makes sense, right?  The disk is slow, RAM is fast and so a RAM drive must be fast.  Right?

Nope.

Yes.  The disk is slow.  But that does not mean that the file system is.  The disk is only part of the file system.  The file system includes this nifty thing called a disk block cache.  The file system will cache often-used disk blocks in RAM.

What?  The file system uses RAM?  Yep.  When you type in “top” and you see that value for “cached”?  That is RAM that the operating system is using to store disk blocks.
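If you would rather read that number from a script than from top, Linux exposes it in /proc/meminfo.  The snippet below is a small sketch that pulls out the "Cached" line; cachedKilobytes() is a hypothetical helper name, and the value is only roughly what top reports, since different tools fold buffers into the figure differently.

```php
<?php
/**
 * Extract the "Cached" value (in kB) from /proc/meminfo-style text.
 * Returns null when the field is not present.
 */
function cachedKilobytes($meminfo)
{
    // Anchoring at the start of a line keeps "SwapCached:" from matching.
    if (preg_match('/^Cached:\s+(\d+)\s+kB/m', $meminfo, $m)) {
        return (int) $m[1];
    }

    return null;
}

// Linux only: /proc/meminfo does not exist on other platforms.
$text   = is_readable('/proc/meminfo') ? file_get_contents('/proc/meminfo') : '';
$cached = cachedKilobytes($text);

echo $cached === null
    ? "Cached value not available\n"
    : "Disk block cache: $cached kB\n";
```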

The result of that caching is this chart.  It measures the throughput in number of requests per second of a static resource via Nginx over HTTP load tested from a remote host on the same network.

Server Throughput (HTTP Requests per Second)

[Chart: fs-benchmarks]

To conduct the test I did 3 test runs for each type of file system: a physical ext3 disk, tmpfs and a RAM drive.  I ran 10,000 iterations via 100 concurrent requests.  The results of the test show that the physical media was actually the fastest!

I was actually a little surprised by that.  I was expecting that the physical (local) file system would only be keeping up; instead it was faster.  I would be willing to chalk that win up to entropy in the system, but the assertion that a RAM drive or tmpfs is faster than the file system is clearly not true.

Don’t get me wrong.  RAM drives are great if you want to explicitly define a file system which WILL stay in RAM.  However, I side with Linus Torvalds (when he was talking about O_DIRECT) that the purpose of an operating system is to manage a lot of this for you.  You might be able to get some better results for RAM or tmpfs from some tuning but it would seem to be a micro-optimization at best or a giant waste of time at worst.

No-.htaccess httpd.conf file for Magento

A couple of days ago I wrote a blog post on why you should not use .htaccess files, or AllowOverride All, on a production web server.  What you should do is place the .htaccess configuration information into your httpd.conf file instead.

So of course I was asked what that would look like.  So here it is.  I took all of the .htaccess settings, stripped some of the superfluous ones and removed the comments ( for clarity :-) ) and here is what you have.  Customize for your own site, of course.

<VirtualHost *:80>
	ServerName magento.loc
	DocumentRoot /var/www/html
	DirectoryIndex index.php
 
	<Directory /var/www/html/var/>
		Order deny,allow
		Deny from all
	</Directory>
 
	<Directory /var/www/html/>
		AllowOverride None
		<IfModule mod_php5.c>
 
		    php_value memory_limit 128M
		    php_value max_execution_time 18000
 
		    php_flag magic_quotes_gpc off
		    php_flag session.auto_start off
 
		</IfModule>
 
		<IfModule mod_security.c>
		    SecFilterEngine Off
		    SecFilterScanPOST Off
		</IfModule>
 
		<IfModule mod_ssl.c>
		    SSLOptions StdEnvVars
		</IfModule>
		<IfModule mod_rewrite.c>
 
		    Options +FollowSymLinks
		    RewriteEngine on
 
		    #RewriteBase /magento/
		    RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
		    RewriteCond %{REQUEST_URI} !^/(media|skin|js)/
		    RewriteCond %{REQUEST_FILENAME} !-f
		    RewriteCond %{REQUEST_FILENAME} !-d
		    RewriteCond %{REQUEST_FILENAME} !-l
		    RewriteRule .* index.php [L]
 
		</IfModule>
 
		    AddDefaultCharset Off
		    #AddDefaultCharset UTF-8

		<IfModule mod_expires.c>
		    ExpiresDefault "access plus 1 year"
		</IfModule>
	    Order allow,deny
	    Allow from all
	</Directory>
 
	<Directory /var/www/html/includes/>
		Order deny,allow
		Deny from all
	</Directory>
 
	<Directory /var/www/html/errors/>
		<FilesMatch "\.(xml|phtml)$">
		    Deny from all
		</FilesMatch>
	</Directory>
 
	<Directory /var/www/html/pkginfo/>
		Order deny,allow
		Deny from all
	</Directory>
 
	<Directory /var/www/html/app/>
		Order deny,allow
		Deny from all
	</Directory>
 
	<Directory /var/www/html/lib/>
		Order deny,allow
		Deny from all
	</Directory>
 
	<Directory /var/www/html/downloader/>
		<IfModule mod_deflate.c>
 
		    RemoveOutputFilter DEFLATE
		    RemoveOutputFilter GZIP
 
		</IfModule>
 
		<Files ~ "\.(cfg|ini|xml)$">
		    order allow,deny
		    deny from all
		</Files>
	</Directory>
 
	<Directory /var/www/html/downloader/template/>
		Order deny,allow
		Deny from all
	</Directory>
 
	<Directory /var/www/html/media/>
		Options All -Indexes
		<IfModule mod_php5.c>
			php_flag engine 0
		</IfModule>
 
		AddHandler cgi-script .php .pl .py .jsp .asp .htm .shtml .sh .cgi
		Options -ExecCGI
 
		<IfModule mod_rewrite.c>
		    Options +FollowSymLinks
		    RewriteEngine on
		    RewriteCond %{REQUEST_FILENAME} !-f
		    RewriteRule .* ../get.php [L]
		</IfModule>
	</Directory>
 
	<Directory /var/www/html/media/customer/>
		Order deny,allow
		Deny from all
	</Directory>
 
	<Directory /var/www/html/media/downloadable/>
		Order deny,allow
		Deny from all
	</Directory>
 
</VirtualHost>

Why you should not use .htaccess (AllowOverride All) in production

Commonly known as .htaccess, AllowOverride is a neat little feature that allows you to tweak the server’s behavior without modifying the configuration file or restarting the server.  Personally, I think this is great for development purposes.  It allows you to quickly test various server configurations without needing to mess with restarting the server.  It helps you be more (buzzword alert!) agile.

Beyond the obvious security problems of allowing configuration modifications in a public document root, there is also a performance impact.  What happens with AllowOverride is that Apache will attempt an open() call for a .htaccess file in each directory on the way down to the requested file.

To demonstrate this I used a program called strace, which traces a process and gives you a list of each system call that it makes.

First we’ll take a look at the strace with AllowOverride set to None.

semop(1638426, {{0, -1, SEM_UNDO}}, 1) = 0
epoll_wait(42, {{EPOLLIN, {u32=3507213864, u64=139813282633256}}}, 2, 10000) = 1
accept4(4, {sa_family=AF_INET6, sin6_port=htons(55755), inet_pton(AF_INET6, "::ffff:192.168.0.212", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28], SOCK_CLOEXEC) = 43
semop(1638426, {{0, 1, SEM_UNDO}}, 1) = 0
getsockname(43, {sa_family=AF_INET6, sin6_port=htons(80), inet_pton(AF_INET6, "::ffff:192.168.0.212", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
fcntl(43, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(43, F_SETFL, O_RDWR|O_NONBLOCK) = 0
read(43, "GET /test.txt HTTP/1.0\r\nHost: ma"..., 8000) = 87
gettimeofday({1361542861, 683952}, NULL) = 0
stat("/var/www/magento.loc/test.txt", {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
open("/var/www/magento.loc/test.txt", O_RDONLY|O_CLOEXEC) = 44
fcntl(44, F_GETFD) = 0x1 (flags FD_CLOEXEC)
fcntl(44, F_SETFD, FD_CLOEXEC) = 0
mmap(NULL, 12, PROT_READ, MAP_SHARED, 44, 0) = 0x7f28cfc74000
writev(43, [{"HTTP/1.1 200 OK\r\nDate: Fri, 22 F"..., 267}, {"hello world\n", 12}], 2) = 279
munmap(0x7f28cfc74000, 12) = 0
write(12, "192.168.0.212 - - [22/Feb/2013:0"..., 79) = 79
shutdown(43, 1 /* send */) = 0
poll([{fd=43, events=POLLIN}], 1, 2000) = 1 ([{fd=43, revents=POLLIN|POLLHUP}])
read(43, "", 512) = 0
close(43) = 0
read(6, 0x7fff79e2d7cf, 1) = -1 EAGAIN (Resource temporarily unavailable)
close(44) = 0
semop(1638426, {{0, -1, SEM_UNDO}}, 1

Now let’s take a look at the strace results with AllowOverride set to All.

semop(1736730, {{0, -1, SEM_UNDO}}, 1) = 0
epoll_wait(42, {{EPOLLIN, {u32=3392874024, u64=140410168747560}}}, 2, 10000) = 1
accept4(4, {sa_family=AF_INET6, sin6_port=htons(55795), inet_pton(AF_INET6, "::ffff:192.168.0.212", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28], SOCK_CLOEXEC) = 43
semop(1736730, {{0, 1, SEM_UNDO}}, 1) = 0
getsockname(43, {sa_family=AF_INET6, sin6_port=htons(80), inet_pton(AF_INET6, "::ffff:192.168.0.212", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
fcntl(43, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(43, F_SETFL, O_RDWR|O_NONBLOCK) = 0
read(43, "GET /test.txt HTTP/1.0\r\nHost: ma"..., 8000) = 87
gettimeofday({1361543373, 140117}, NULL) = 0
stat("/var/www/magento.loc/test.txt", {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
open("/var/www/.htaccess", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/var/www/magento.loc/.htaccess", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/var/www/magento.loc/test.txt/.htaccess", O_RDONLY|O_CLOEXEC) = -1 ENOTDIR (Not a directory)
open("/var/www/magento.loc/test.txt", O_RDONLY|O_CLOEXEC) = 44
fcntl(44, F_GETFD) = 0x1 (flags FD_CLOEXEC)
fcntl(44, F_SETFD, FD_CLOEXEC) = 0
mmap(NULL, 12, PROT_READ, MAP_SHARED, 44, 0) = 0x7fb3c8bf9000
writev(43, [{"HTTP/1.1 200 OK\r\nDate: Fri, 22 F"..., 267}, {"hello world\n", 12}], 2) = 279
munmap(0x7fb3c8bf9000, 12) = 0
write(12, "192.168.0.212 - - [22/Feb/2013:0"..., 79) = 79
shutdown(43, 1 /* send */) = 0
poll([{fd=43, events=POLLIN}], 1, 2000) = 1 ([{fd=43, revents=POLLIN|POLLHUP}])
read(43, "", 512) = 0
close(43) = 0
read(6, 0x7fff95abfc1f, 1) = -1 EAGAIN (Resource temporarily unavailable)
close(44) = 0
semop(1736730, {{0, -1, SEM_UNDO}}, 1

You can clearly see the additional open() calls being made to try and discover the .htaccess file.  In this case the calls are completely superfluous because we have nothing there.  But even so we have a significant impact on static file throughput.
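The lookups in that trace follow a simple pattern, which the sketch below reproduces.  It is a simplified model of what the strace output shows, not Apache's actual algorithm.  htaccessCandidates() is a hypothetical function, and treating /var/www as the first directory where overrides are allowed is an assumption inferred from the trace.

```php
<?php
/**
 * List the .htaccess paths that would be probed for a request, starting
 * at $overrideRoot and walking down to the requested file.
 * Assumes $requestedFile lives underneath $overrideRoot.
 */
function htaccessCandidates($overrideRoot, $requestedFile)
{
    $root     = rtrim($overrideRoot, '/');
    $relative = trim(substr($requestedFile, strlen($root)), '/');

    $paths   = array($root . '/.htaccess');
    $current = $root;

    foreach (explode('/', $relative) as $segment) {
        if ($segment === '') {
            continue;
        }
        // Every path segment gets a probe, including the file name
        // itself, which is the ENOTDIR call seen in the trace.
        $current .= '/' . $segment;
        $paths[] = $current . '/.htaccess';
    }

    return $paths;
}

$paths = htaccessCandidates('/var/www', '/var/www/magento.loc/test.txt');
print_r($paths);
```

For the request in the trace this yields the same three paths that show up as failed open() calls.  A file nested several directories deep would add one more failed lookup per level.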

AllowOverride None

Concurrency Level: 10
Time taken for tests: 2.441 seconds
Complete requests: 10000
Failed requests: 0
Write errors: 0
Total transferred: 2790279 bytes
HTML transferred: 120012 bytes
Requests per second: 4096.02 [#/sec] (mean)
Time per request: 2.441 [ms] (mean)
Time per request: 0.244 [ms] (mean, across all concurrent requests)
Transfer rate: 1116.12 [Kbytes/sec] received

AllowOverride All

Concurrency Level: 10
Time taken for tests: 3.922 seconds
Complete requests: 10000
Failed requests: 0
Write errors: 0
Total transferred: 2790558 bytes
HTML transferred: 120024 bytes
Requests per second: 2549.42 [#/sec] (mean)
Time per request: 3.922 [ms] (mean)
Time per request: 0.392 [ms] (mean, across all concurrent requests)
Transfer rate: 694.76 [Kbytes/sec] received

The requests where AllowOverride was turned off completed in about 62% of the time taken by the ones where AllowOverride was turned on (2.441 seconds versus 3.922 seconds).
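For the record, here is the arithmetic behind that comparison, using the figures from the two ab runs above.  The variable names are mine.

```php
<?php
// "Time taken for tests" and "Requests per second" from the two runs.
$timeNone = 2.441;   // seconds, AllowOverride None
$timeAll  = 3.922;   // seconds, AllowOverride All
$rpsNone  = 4096.02;
$rpsAll   = 2549.42;

// Fraction of the time the AllowOverride None run needed.
$timeRatio = $timeNone / $timeAll;

// Fraction of throughput lost by turning AllowOverride on.
$throughputLoss = 1 - ($rpsAll / $rpsNone);

printf("Time ratio: %.1f%%\n", $timeRatio * 100);
printf("Throughput lost: %.1f%%\n", $throughputLoss * 100);
```

Either way you slice it, turning AllowOverride on cost a bit over a third of the static file throughput in this test.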

And remember, this is just the impact of file operations and does not take into account the time to reconfigure Apache during the course of these requests.

So the data would clearly show that there is a negative impact to having AllowOverride turned on in a production environment.  Instead it will generally be better to take those changes in .htaccess and place them in your httpd configuration file.

[UPDATE]

In fact Mike Willbanks says you should never do it.  I agree with him, but I wouldn’t make as big a stink in dev as I would in prod.

 

Don’t modify index.php in Magento for multi-store configs

unless you really, really have to.  I did some quick Googling on this and found that a number of places recommend making changes to index.php.

In my humble opinion it is better to leave the index.php file alone and configure your stores via SetEnv in your Apache config, as is noted in the Magento wiki.

For example

<VirtualHost *:80>
SetEnv MAGE_RUN_CODE msv
ServerName kps.loc
DocumentRoot /var/www/magento.loc/magento
CustomLog logs/magento.access_log varnishcombined
ErrorLog logs/magento.error_log
</VirtualHost>

<VirtualHost *:80>
SetEnv MAGE_RUN_CODE base
ServerName magento.loc
DocumentRoot /var/www/magento.loc/magento
CustomLog logs/magento.access_log varnishcombined
ErrorLog logs/magento.error_log
</VirtualHost>

No code changes required and upgrades won’t break your modifications.
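The reason this works without code changes is that values set with SetEnv show up in $_SERVER, and the stock index.php already looks there.  The sketch below mirrors that lookup as I understand it; resolveRunParams() is a hypothetical helper for illustration, and the MAGE_RUN_TYPE variable with its 'store' default is an assumption about the stock file rather than something shown in the config above.

```php
<?php
/**
 * Work out the run code and run type from a $_SERVER-style array.
 */
function resolveRunParams(array $server)
{
    // Which store or website to load; empty means the default store.
    $code = isset($server['MAGE_RUN_CODE']) ? $server['MAGE_RUN_CODE'] : '';

    // Whether the code names a store or a website (assumed default: store).
    $type = isset($server['MAGE_RUN_TYPE']) ? $server['MAGE_RUN_TYPE'] : 'store';

    return array($code, $type);
}

// Simulates a request to the first virtual host above.
list($code, $type) = resolveRunParams(array('MAGE_RUN_CODE' => 'msv'));

echo "Run code: $code, run type: $type\n";
```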

That is all.

indeyets/appserver-in-php · GitHub

indeyets/appserver-in-php · GitHub.

Lukas Smith responded on Twitter to a post I made on this blog about the possibility of having a precompiled bootstrap in PHP that would allow large sections of bootstrapping code to be bypassed, including autoloading, class definitions and certain objects.

His tweet linked to the link above, which solves the problem, but in a different way.  It requires a middle layer that needs to be running to process these requests.  I believe that I have thought about doing something similar using ESI locally, which did not work out so well since Varnish processed ESI requests synchronously.

I have not played with this library and I would prefer a solution that is closer to the engine and doesn’t add another layer.  That said, this looks like an interesting project that seems worth taking a look at.

Would this be a dumb idea for PHP core?

So, I have been playing around with an idea in my head for a while, a few years now.  It really came along as we started seeing more and more PHP applications rely on bootstrapping.  For me it was as I saw more ZF applications becoming more and more complicated.  At the time I was consulting and I would see significant server resources consumed by bootstrapping the apps.  Loading config files, loading dependent classes, setting up dependencies, initializing ACLs, and the list goes on and on.

One of the ways to negate the effect would be to cache a bootstrap object and then pull that object from the cache at the start of the request.  However, the problem is that unserialization can actually end up taking more time than the bootstrap process itself.
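If you want to check that trade-off for your own application, timing both paths is straightforward.  The following is a rough sketch under stated assumptions: timeCall() and buildBootstrap() are hypothetical names, and the array built here is only a stand-in for a real bootstrap object, so the numbers it prints say nothing about any particular framework.

```php
<?php
/**
 * Run a callable once and return the elapsed wall-clock time in seconds.
 */
function timeCall(callable $fn)
{
    $start = microtime(true);
    $fn();

    return microtime(true) - $start;
}

/**
 * Hypothetical stand-in for an expensive bootstrap step.
 */
function buildBootstrap()
{
    $config = array();
    for ($i = 0; $i < 1000; $i++) {
        $config["key$i"] = array('value' => $i, 'enabled' => true);
    }

    return $config;
}

// What a cache would hold between requests.
$cached = serialize(buildBootstrap());

$rebuild = timeCall('buildBootstrap');
$restore = timeCall(function () use ($cached) {
    unserialize($cached);
});

printf("rebuild: %.6f s, unserialize: %.6f s\n", $rebuild, $restore);
```

Swap buildBootstrap() for your real bootstrap and the comparison tells you whether caching the object is a win or a loss in your case.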

So, I was wondering.  Perhaps there would be a way to provide a cacheable state of the Zend Engine.

Perhaps it would look something like this.

init_engine_state(Callback $init);

What this would do is call the callback and after the callback returns, but before init_engine_state() returns, the engine would take a snapshot of everything except the superglobals.  This would include classes, objects and opcodes.  The next time a request comes in the callback would not be executed, but the state of the engine would be set to the state that it was in during the previous run of the callback.

Internally, what would happen before init_engine_state() returns is that all of the pertinent hash tables would be copied to a different memory block for the initial request.  Then the next time a request comes in, memory for the copied hashtables would overwrite the existing ones.  As noted earlier this could also include opcodes for files which would mean that the reams of autoloading function calls that typically happen could be completely bypassed.

I have seen legitimate cases where bootstrapping is taking 50% of the wallclock time.  Perhaps by providing an engine hook like this PHP performance could be dramatically improved.

 

… or maybe I’m just speaking out of my buttocks.

 

Please comment.  (On the idea, not my buttocks)

 

[UPDATE]

Here is an additional snippet of code that might help explain what I’m thinking of

<?php
 
$app = init_engine_state(function() {
    // This code would only be executed once
    require_once 'lib/code/Application.php';
    require_once 'lib/code/Autoloader.php';
 
    $app = new Application();
    $app->createAutoloader();
    $app->bootstrap();
    return $app;
    /* 
    * At this point a snapshot would be taken of
    * - opcodes
    * - class definitions
    * - objects
    * 
    */
});
// No autoloading would need to be done
$adapter = Application::getAdapter();
$app->dispatch();
