Magento Performance on PHP 5.3, 5.4 and 5.5RC3

I woke up this morning with a burning desire to do load tests.  Actually, I woke up with a burning desire to not do the same thing I did yesterday and needed a slight change, so I decided to do a load test.  I wanted to see what the performance difference for Magento was between PHP versions 5.3, 5.4 and 5.5RC3.

As you may know, Magento only supports 5.3 and 5.2.  Personally, I would not even be thinking about running any kind of remotely serious ECommerce site on PHP 5.2.  But with work on PHP 5.5 pushing towards GA it means that some time soon support for 5.3 will be going away.  This might be a bit of a problem for software that isn’t supported on 5.4+.  One of the reasons for this is that there are bugs in PHP that are holding back support.  I don’t know what they all are, but this one regarding XML processing is one of them.  There might be others, but that’s the one that I know of.

But enough about bugs, what about performance?  For PHP 5.3 and 5.4 I used Zend Server with Optimizer+.  This is partially because I use Zend Server on my local machine and also because it would give a good comparison with PHP 5.5, since Optimizer+ has been open sourced and will be included.

The configure settings I used for PHP 5.5 are these

./configure \
  --with-config-file-path=/etc/php-5.5 \
  --with-config-file-scan-dir=/etc/php-5.5/php.d \
  --disable-debug \
  --enable-inline-optimization \
  --disable-all \
  --enable-libxml \
  --enable-session \
  --enable-xml \
  --enable-hash \
  --with-pear \
  --with-layout=GNU \
  --enable-filter \
  --with-pcre-regex \
  --with-zlib \
  --enable-simplexml \
  --enable-dom \
  --with-openssl \
  --enable-pdo \
  --with-pdo-sqlite \
  --with-readline \
  --with-iconv \
  --with-sqlite3 \
  --disable-phar \
  --enable-xmlwriter \
  --enable-xmlreader \
  --enable-mysqlnd \
  --enable-json \
  --with-gd \
  --enable-soap \
  --with-curl \
  --with-apxs2 \
  --enable-ctype \
  --with-pdo-mysql \
  --prefix=/opt/php-5.5 \
  --enable-opcache

My 5.5 opcode cache settings were these

zend_extension=opcache.so
opcache.memory_consumption=128
opcache.interned_strings_buffer=8
opcache.max_accelerated_files=4000
opcache.revalidate_freq=60
opcache.fast_shutdown=1
opcache.enable_cli=1

I make no guarantees that these are the optimal settings.
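
If you want to confirm which of these values your running PHP build actually picked up, something along these lines should do it.  This is only a sketch: report_opcache_settings() is a name I made up for illustration, and ini_get() simply returns false for any directive that is not registered (for example when the opcache extension is not loaded).

```php
<?php
// Sketch: show the opcache directives listed above as this PHP process sees them.
// report_opcache_settings() is a hypothetical helper name, not part of PHP.
function report_opcache_settings(array $directives)
{
    $report = array();
    foreach ($directives as $name) {
        // ini_get() returns false when the directive is not registered,
        // e.g. when the opcache extension is not loaded.
        $value = ini_get($name);
        $report[$name] = ($value === false) ? '(not available)' : $value;
    }
    return $report;
}

$directives = array(
    'opcache.memory_consumption',
    'opcache.interned_strings_buffer',
    'opcache.max_accelerated_files',
    'opcache.revalidate_freq',
    'opcache.fast_shutdown',
    'opcache.enable_cli',
);

foreach (report_opcache_settings($directives) as $name => $value) {
    echo "$name = $value\n";
}
```

Run it through the same SAPI you are testing, since the CLI and the web server can load different ini files.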

I used Siege as the load testing tool, running against a single URL that had some relatively complex logic in it.  I was using Magento Enterprise 1.13, but had full page caching turned off.  I ran with only 4 concurrent sessions since that’s how many CPUs are on the machine I was testing.  This was not a test of server capacity, but of raw performance.  I suppose that a single concurrent session would have been better, but c’est la vie.  I didn’t see response times degrade until I went to 5 concurrent sessions anyway, so I doubt this was an issue.

The first chart is the throughput per second, so higher is better.

magento-php-line

 

As you can see, PHP 5.4 and 5.5 fared better than 5.3.  5.5 fared just a little better than 5.4.

This next chart shows the slowest, fastest and average times for each.  Lower is better.

magento-php-bar

 

The slowest time is not really all that interesting since every load test will have a few hiccups.  I suspect that if I took the 95th percentile it would look pretty close to the average.  But overall 5.4 and 5.5RC3 did better in all the data points that matter.
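
For anyone who wants to check the percentile hunch against their own numbers, here is a minimal sketch of a nearest-rank percentile.  The percentile() helper and the response times are made up for illustration; they are not the data from my test runs.

```php
<?php
// Nearest-rank percentile: the smallest sample such that at least $p percent
// of all samples are less than or equal to it. Expects a non-empty array.
function percentile(array $samples, $p)
{
    sort($samples);
    $rank = (int) ceil($p * count($samples) / 100);
    return $samples[max($rank, 1) - 1];
}

// Made-up response times in seconds; one slow outlier simulates a hiccup.
$times = array(0.21, 0.22, 0.20, 0.23, 0.21, 0.22, 0.24, 0.20, 0.21, 0.22,
               0.23, 0.21, 0.20, 0.22, 0.21, 0.23, 0.22, 0.21, 0.20, 1.90);

printf("average:         %.3f\n", array_sum($times) / count($times));
printf("95th percentile: %.2f\n", percentile($times, 95));
printf("slowest:         %.2f\n", max($times));
```

With a single outlier in the sample the slowest time is dominated by that one request, while the 95th percentile stays down with the bulk of the requests.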

Now to get those bugs fixed so Magento can support those two…

How configuration works in Magento

One of the things that I have been doing over the past several months has been to write up how Magento works on the inside.  These write-ups have primarily been for my own benefit, but could probably fill a small book by now.  I was doing this mostly to make sure that what I thought I understood about Magento was actually right, and so I intentionally have been doing a pretty deep dive on Magento’s internal workings.

It has been a tremendously useful exercise for me and so I wanted to share it.  The first one was about configuration, which is really one of the important linchpins of the system.  EVERYTHING depends on the configuration to some degree.  On a typical setup a compiled configuration object in the cache is well north of 300k of XML.

This is a long write-up.  That’s because it’s pretty complete.  Yes, I’m sure I missed some things but this will probably teach you more about Magento configuration than you ever wanted to know.  Also, it is a little on the raw side, so I beg your forgiveness on that.

Here we go.

Order of operations

  1. Load local.xml

  2. Load module XMLs from etc/modules

  3. Load individual config.xml files for modules

  4. Reload local.xml to make sure its values have not been overridden

  5. Load configuration from the DB

  6. Merge default scope values into config

  7. Copy default values into websites nodes

  8. Copy website values into store nodes

  9. Cache configuration and config sections in cache

Description of operations

Configuration starts in Mage::run() when Mage::_setConfigModel() is called.  This method allows a developer to use a class other than the default Mage_Core_Model_Config.  However, the class that the developer supplies must extend Mage_Core_Model_Config and be specified in the config_model option.  Some reasons for using this could include getting configuration from alternate data sources such as a web service or perhaps an LDAP server, but generally the default would be sufficient.

When Mage_Core_Model_Config is created it sets its own cache ID as “config_global”.  It also creates an object called Mage_Core_Model_Config_Options which is used for creating the base options for the application such as the location of the etc, code and design directories.  Additionally it creates an instance of Mage_Core_Model_Config_Base which is used to clone configuration nodes to help with merging XML configuration files.

The parent of the Mage_Core_Model_Config class is Mage_Core_Model_Config_Base and its parent constructor is called.  In the constructor the class Mage_Core_Model_Config_Element is set as the elementClass.  Because it is set here it does not seem like it can be overridden using the base configuration object.  If you want a different element class for some reason you will need to create a new configuration base class and change it there.  In other words, the element class is not configurable, and you’ll probably never want it to be.  But if you really want to, you can pass the name of your config model in the $options array of Mage::run('', '', $options) under the array key ‘config_model’.  But, again, very seldom would you really want to do that.

The parent of Mage_Core_Model_Config_Base is Varien_Simplexml_Config.  This is the most basic class for handling configuration and since Mage::_setConfigModel() requires that a configuration object extend Mage_Core_Model_Config_Base, a child class of Varien_Simplexml_Config, this class will be loaded for any configuration activity that needs to be completed.

The constructor of Varien_Simplexml_Config accepts a sourceData parameter.  This parameter can be null, contain a filename or an XML string.  It can also be an instance of Varien_Simplexml_Element which is passed into setXml() and will overwrite any previously calculated configuration options.

If the option passed is a file name then loadFile() will be called.  The file data is loaded via file_get_contents() after which processFileData() is called.  This method does not do anything in the base configuration object but could theoretically be used to massage data, perhaps like having a JSON configuration file which would be parsed and returned as an XML string.  Or perhaps there could be a call to a third party web service which could provide additional configuration options to be merged based on the configuration text.  However, if you are passing it to Mage_Core_Model_Config::__construct() you will run into a problem with anything but an array.  This is because Mage_Core_Model_Config creates a Mage_Core_Model_Config_Options class internally for some basic system options, like file locations.   This class does not understand anything but arrays and so you end up with a misconfigured system if you pass in a string.  So, stick to arrays.

Once the file has been loaded or if the sourceData parameter was XML, the loadString() method is called.  This method takes the raw XML string as a parameter and passes it to simplexml_load_string() along with _elementClass.  This will cause the SimpleXML extension to parse the XML document into elements of the type specified.  After it has been parsed, Varien_Simplexml_Config checks to make sure that the element is an instance of Varien_Simplexml_Element.  This validates that the XML is actually parsable and that any element class that is specified extends the *_Element class.

At the end of this process the configuration object is stored in the static variable Mage::$_config and is directly accessible via the Mage::getConfig() method.

After Mage::_setConfigModel() is run the previously created instance of Mage_Core_Model_App is referenced and its run() method is called.  In a default scenario this will be the first time that the configuration is actually loaded.  The run() method calls another method called baseInit().  The run() method takes an array of options that can be passed into baseInit() but in the default scenario this is not done.

Mage_Core_Model_App::baseInit() retrieves the configuration object from Mage::getConfig() and stores it locally in the object.  If there are options provided in the bootstrap this is where they are added to the configuration object.  The baseInit() method calls the loadBase() method which retrieves the etc directory location from the configuration options which were set earlier and iterates over any .xml files in the directory and merges them into $this.  During the iteration the _prototype instance is cloned and the clone is used to load the XML file.  The resulting object is then merged into the current configuration node.

That merging is accomplished by calling the extend() method.  This method retrieves the local instance of the Varien_Simplexml_Element, or the specified child node.  This node’s extend() method is called which takes the to-be-merged element’s child nodes and adds each to itself.  This is done by iterating over the nodes and explicitly adding each node and its attributes.  This extend() method is a feature that is not part of the main PHP XML handling functionality.  If you try to replicate this functionality on your own it will take a lot of time and code.  Just steal Varien_Simplexml_*.  :-)
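
To give a feel for what that kind of merge involves, here is a stripped-down sketch using plain SimpleXMLElement.  It is not the Varien code: merge_xml() is a name I made up, the sample XML is invented, and the real extend() handles more cases than this does.

```php
<?php
// Recursively merge $source into $target, in the spirit of the extend() behavior
// described above. merge_xml() is a simplified stand-in, not the Varien code.
function merge_xml(SimpleXMLElement $target, SimpleXMLElement $source, $overwrite = true)
{
    foreach ($source->children() as $name => $child) {
        if ($child->count() === 0) {
            // Leaf node: replace the existing value, or keep it if not overwriting.
            if (isset($target->{$name})) {
                if (!$overwrite) {
                    continue;
                }
                unset($target->{$name});
            }
            // addChild() does not escape "&", so escape the text first.
            $node = $target->addChild($name, htmlspecialchars((string) $child, ENT_NOQUOTES));
        } else {
            // Branch node: reuse the existing branch or create it, then recurse.
            $node = isset($target->{$name}) ? $target->{$name}[0] : $target->addChild($name);
            merge_xml($node, $child, $overwrite);
        }
        // Copy the attributes explicitly; SimpleXML will not do it for us.
        foreach ($child->attributes() as $attrName => $attrValue) {
            if ($overwrite || !isset($node[$attrName])) {
                $node[$attrName] = (string) $attrValue;
            }
        }
    }
}

// Invented sample documents standing in for two configuration files.
$base = simplexml_load_string(
    '<config><global><cache><backend>file</backend></cache></global></config>'
);
$local = simplexml_load_string(
    '<config><global><cache><backend>apc</backend><prefix scope="global">mage_</prefix></cache>'
    . '<install><date>today</date></install></global></config>'
);

merge_xml($base, $local);
echo $base->asXML();
```

Even this reduced version needs recursion, a leaf/branch distinction and manual attribute copying, which is why I say stealing the existing classes is the easier route.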

Once all of the files have been loaded the Mage_Core_Model_Config object checks to see if local.xml has been loaded and marks a flag.

After Mage_Core_Model_App::_initBaseConfig() has finished the cache is loaded in _initCache().  Cache options can be provided with an option key of ‘cache’.  Any options that are provided will overwrite configuration values.  The cache node is ‘global/cache’ and is hard coded into the application.

Mage_Core_Model_Config::getNode() takes three parameters, all of which are optional, to allow the developer to retrieve the most pertinent configuration node.  The parameters are path, scope and scope code.

The scope and scope code allow developers to retrieve like content from different “contexts” such as stores or websites.   Scope for stores and websites can be provided as plural or singular values.  If a scope is provided that is not default, store or website an exception will be thrown.

If a valid path is provided an object of type Mage_Core_Model_Config_Element is returned.  Because this object ultimately extends SimpleXMLElement it can be acted upon like any SimpleXML element.

Up until this point none of the module configurations have been loaded.  The module config may be cached whereas the app/etc/*.xml configuration is not.  This caching is determined by the settings for _allowedCacheOptions in Mage_Core_Model_Cache.  This is set by querying the core_cache_option table which states which areas can be cached (do SELECT * FROM core_cache_option and compare it to the Cache Configuration screen in the UI).  This is where the cache enablement for things like EAV, block_html or config are stored.

The module configuration files will be loaded in the order of Mage_All, Mage_* and then any external configuration files.  There is the ability to run a limited version of Magento with only specific modules allowed, but that would seem to be implemented by creating a custom bootstrap process.

The module XML files are merged into a temporary variable which is processed locally.  It will be iterated over during which the dependencies of each module will be calculated in the method Mage_Core_Model_Config::_sortModuleDepends().  The module list is resorted and if any module dependencies for active modules cannot be satisfied an exception is thrown.  The module configurations are sorted in reverse order, presumably because of the order that the files are loaded.  The module configurations are then iterated over in normal order so that the base dependency of Mage_Core, which has no dependencies, is done first and the module dependencies are processed in order of greater to less importance.  If a module with a dependency is defined but cannot be satisfied an exception is thrown at this point too.
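
The idea behind the dependency sort can be sketched in a few lines.  This is not Magento’s _sortModuleDepends(); sort_modules() is a name I made up, the dependency lists are simplified, and Acme_Widget is an invented module.

```php
<?php
// Order modules so that every module comes after the modules it depends on.
// sort_modules() is an illustrative helper, not Magento's _sortModuleDepends().
function sort_modules(array $modules)
{
    $sorted = array();
    $visiting = array();

    $visit = function ($name) use (&$visit, &$sorted, &$visiting, $modules) {
        if (isset($sorted[$name])) {
            return; // already placed
        }
        if (!array_key_exists($name, $modules)) {
            throw new RuntimeException("Module \"$name\" is required but not declared");
        }
        if (isset($visiting[$name])) {
            throw new RuntimeException("Circular dependency at \"$name\"");
        }
        $visiting[$name] = true;
        foreach ($modules[$name] as $dependency) {
            $visit($dependency); // place dependencies first
        }
        unset($visiting[$name]);
        $sorted[$name] = true;
    };

    foreach (array_keys($modules) as $name) {
        $visit($name);
    }
    return array_keys($sorted);
}

// Simplified dependency lists; Acme_Widget is an invented module.
$modules = array(
    'Mage_Catalog' => array('Mage_Eav', 'Mage_Core'),
    'Mage_Eav'     => array('Mage_Core'),
    'Mage_Core'    => array(),
    'Acme_Widget'  => array('Mage_Catalog'),
);

echo implode("\n", sort_modules($modules)), "\n";
```

Mage_Core, having no dependencies, ends up first, and a module that names a dependency which was never declared triggers an exception, which matches the behavior described above.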

Once the dependencies have been calculated a new configuration model is instantiated which will be merged into the mainline config.  The sorted module list is iterated over, and each module’s node is retrieved from the local configuration object and merged into the sorted configuration.  Once the module list has been iterated over, the sorted configuration object is merged into the mainline configuration object.

At this point the individual configuration files for each of the modules will be read.  Prior to reading them Magento will check to see if modules in the local code pool can be configured.  If they are disabled then the directory for the local code pool will be removed from the include_path.

All the active modules are then iterated over and a quick check is done to make sure that if local modules are not allowed they are not included, though it looks like this check is only done on the configuration and so technically something could be installed in the local code pool but state that it is in base or core and bypass this.

It will load the config.xml from the etc directory of each module and merge it into the main configuration.

Once the configuration object has done that it will re-load the app/etc/local.xml file to make sure that its directives have not been overridden.

The final part of the module configuration loading task is to find any configuration elements that extend any others.  If there are any (and I did not find any in the default configuration) the configuration object will copy the elements into the element being extended.

Once the configuration files have all been loaded and processed any system updates are executed (which I will not cover here).  From there configuration options are loaded from the database via the loadToXml() method in Mage_Core_Model_Resource_Config, referenced from Mage_Core_Model_Config::loadDb().

When retrieving the database adapter for reading the configuration, the resource model first checks the write adapter to see if it is in the middle of a transaction.  If it is, it returns the write adapter; if it is not, it returns the read adapter.  This feature is part of Mage_Core_Model_Resource_Db_Abstract and as such is available to any resource that extends it.  Kinda neat.

The first thing that loadToXml() does is load the website data into the configuration object.  It does this by querying the core_website table for the website_id, code and name columns.  This data is inserted into the configuration at /websites.

After the websites have been loaded the store configuration is inserted.  The core_store table is queried for the store_id, code, name and website_id for the store and stored in the /stores location.

After the stores have been loaded the core configuration is loaded.  This is done by querying the core_config_data.  In the result set first the configuration items that have a scope of “default” are entered into the configuration.  Then each of the websites configurations are iterated over and the values for the default scope are copied into each.

After the values for the default scope have been copied in then the nodes for the individual scope are copied into their respective paths.  Yes, the configuration is copied to the various contexts by default.  If you have a lot of these contexts you could end up with a very large configuration document.

After the configuration values have been copied into the website configuration, each of the stores for each of the websites has the values of its “owner” website copied into it.  Then the configuration items for each individual store are copied into the store configuration node.
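
The copy-down order is easier to see with plain arrays than with XML.  The following is only a sketch of the idea: build_scoped_config() is a made-up helper, and the website, store and values are invented.

```php
<?php
// build_scoped_config() is an illustrative helper: it mimics the copy-down
// order described above using plain arrays instead of XML nodes.
function build_scoped_config(array $default, array $websites, array $stores)
{
    $config = array('default' => $default, 'websites' => array(), 'stores' => array());

    foreach ($websites as $code => $website) {
        // Default values first, then the website's own overrides.
        $config['websites'][$code] = array_merge($default, $website['values']);
    }
    foreach ($stores as $code => $store) {
        // The owning website's merged values first, then the store's overrides.
        $parent = $config['websites'][$store['website']];
        $config['stores'][$code] = array_merge($parent, $store['values']);
    }
    return $config;
}

// Invented sample data.
$default  = array(
    'general/locale/code'   => 'en_US',
    'web/unsecure/base_url' => 'http://example.com/',
);
$websites = array(
    'base' => array('values' => array('web/unsecure/base_url' => 'http://shop.example.com/')),
);
$stores = array(
    'german' => array('website' => 'base', 'values' => array('general/locale/code' => 'de_DE')),
);

print_r(build_scoped_config($default, $websites, $stores));
```

Note that every website and every store ends up with a full copy of every value, which is exactly why the document grows so quickly as you add contexts.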

One interesting thing to note about this process is that if a website or store scope is in the configuration but not in the list of stores or websites, its configuration options are automatically deleted from the configuration.

At this point the configuration is stored in the cache, as long as it is allowed to be cached.  The saving mechanism iterates over each of the sections to check the level of recursion required for each of the specified sections.  This allows the system to cache individual sub-sections instead of having to cache the whole section.  By default this is limited to stores so that each store can be referenced individually.  Given the size of the configuration document, separating individual store config elements is a good thing.  Once the specified sections have been added to the parts array the rest of the configuration is stored in the generic cache tag.

At this point the configuration has been loaded and cached.  There is more to be done before the request can be dispatched but that will be covered in another document.

Retrieving configuration is done by requesting a path that corresponds to the XML structure, similar to XPath but without many of the options that XPath provides (thankfully).  This is done by calling the getNode() method on the configuration object that the developer is working with.  Calling this method will return the Varien_Simplexml_Element object that corresponds with the requested node, or false if the node does not exist (I would argue null would be more appropriate, but c’est la vie).
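
A path lookup of this kind is simple enough to sketch with plain SimpleXML.  get_node() below is a made-up stand-in, not Magento’s implementation, and the XML is invented.

```php
<?php
// Walk a slash-separated path through a SimpleXML tree and return the matching
// element, or false when any segment is missing. get_node() is an illustrative
// stand-in for the lookup described above, not Magento's implementation.
function get_node(SimpleXMLElement $root, $path)
{
    $node = $root;
    foreach (explode('/', $path) as $segment) {
        if ($segment === '') {
            continue; // tolerate leading, trailing or doubled slashes
        }
        if (!isset($node->{$segment})) {
            return false;
        }
        $node = $node->{$segment};
    }
    return $node;
}

// Invented sample configuration.
$xml = simplexml_load_string(
    '<config><stores><default><test><data><value>42</value></data></test></default></stores></config>'
);

var_dump((string) get_node($xml, 'stores/default/test/data/value'));
var_dump(get_node($xml, 'stores/default/missing'));
```

Because a missing node comes back as false rather than as an element, callers have to check the return value before casting it or descending further.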

To retrieve the appropriate configuration node the developer has several different options available.  I would argue that it is best to get configuration from the most specific component.  Since the specificity goes from default -> website -> store, it would be preferable to get a configuration value of “test/data/value” from Mage::app()->getStore()->getConfig('test/data/value') if one is working with storefront-based functionality.  If the logic is working on the website level then it would be Mage::app()->getWebsite()->getConfig('test/data/value').  In other words, when working within the store or website contexts the context of the configuration option requested should match, rather than reaching in through Mage::getConfig()->getNode("stores/{$storeId}/test/data/value").

Testing GlusterFS for Magento

I am not a fan of NFS for production information.  NFS is great for aggregating data from across multiple different machines, storing deployment files and other such administrative things.  Serving static content?  No.  I haven’t blogged about it but I have talked about it several times at conferences.  NFS, as a static content distribution mechanism is horribly slow.  Here’s the chart to prove it.

nfs-bad

Never mind the issues that you need to take into consideration.

I know a lot of Magento deployments use NFS so they can have static content accessible from multiple different servers and have it all “up to date”.  But, like I said, I am not a fan of using NFS for production and so while it works, in my opinion, it is not optimal.

So I wanted to test GlusterFS to see if it provided a better solution.  I wrote a quick test script to try it out.  I tried creating multiple files, reading from one file multiple times, reading from the sequence of files multiple times and then deleting the files.  The Gluster configuration is vanilla, as is NFS.  Here’s the code I used to test.

// Directories to compare (which mount is which is explained below).
$fs = array('/kschroeder/gfstest', '/mnt/gfstest', '/var/tmp/gfstest');

foreach ($fs as $f) {
  @mkdir($f, 0755, true);

  // Write 500 files of 10 KB each.
  $startTime = microtime(true);
  for ($i = 0; $i < 500; $i++) {
    file_put_contents($f . '/test-' . $i, str_pad('', 10240, '0'));
  }
  $elapsed = microtime(true) - $startTime;
  echo "$f write $elapsed\n";

  // Read the same file 10,000 times.
  $startTime = microtime(true);
  $files = glob("$f/*");
  $file = array_shift($files);
  for ($i = 0; $i < 10000; $i++) {
    file_get_contents($file);
  }
  $elapsed = microtime(true) - $startTime;
  echo "$f read-one $elapsed\n";

  // Read every file once.
  $startTime = microtime(true);
  foreach (glob("$f/*") as $file) {
    file_get_contents($file);
  }
  $elapsed = microtime(true) - $startTime;
  echo "$f read-all $elapsed\n";

  // Delete every file.
  $startTime = microtime(true);
  foreach (glob("$f/*") as $file) {
    unlink($file);
  }
  $elapsed = microtime(true) - $startTime;
  echo "$f delete $elapsed\n";

}

/kschroeder/gfstest was an NFS mount, /mnt/gfstest was a GlusterFS mount and /var/tmp/gfstest was local.

Here are the results.

not good enough

This is the elapsed time to conduct the individual tests so lower is better.  GlusterFS did better than NFS on all the different operations.  However, when compared with local speeds there was no comparison.

GlusterFS is definitely better, IMHO, than NFS.  It does a lot of things like clustering, striping and replication out of the box and is very easy to configure all of those options without having to worry about magic or other components to make it work.  For that reason I would definitely put it on my list of preferred solutions for handling static web content.

But what I want is a distributed, replicated static file storage mechanism that does client-side caching so my static file read performance is at least almost as good as local.  And if my main file server goes down I don’t want my entire web site to go down.  And it needs to be easy to install and manage so that merchants, who often do not have a lot of system administration experience, are able to do basic SA tasks.  As much of an improvement as GlusterFS is, out of the box it doesn’t seem to solve the problem fully.

The search continues (and I do have some other possibilities I want to test).

 

What I would love in PHP-FPM

I love most things about PHP, but what I don’t like is that in order for me to do any kind of asynchronous processing I need to create an infrastructure.  In other words, I need to build a queuing daemon or build some kind of interface.

It really shouldn’t be that much work for what is a simple task in many other languages.

So it would be really cool if PHP-FPM had a FIFO/delayed queue where you could inject a FastCGI request into the queue and do either fire and forget or allow the executing process to wait on a queue selector.  So it would look kind of like this

$j1 = new FpmRequest('/some/url', 'POST', array('var' => 1, 'var2' => 2));
 
$q = new FpmQueue();
$q->addJob($j1);
$q->execute();

or if you want to wait for the response, this

$j1 = new FpmRequest('/some/url', 'POST', array('var' => 1, 'var2' => 2));
$j2 = new FpmRequest('/some2/url2', 'POST', array('var' => 1, 'var2' => 2));
$j3 = new FpmRequest('/some3/url3', 'POST', array('var' => 1, 'var2' => 2));
$q = new FpmQueue();
$q->addJob($j1);
$q->addJob($j2);
$q->addJob($j3);
 
$q->execute();
$q->wait();
 
echo $j1->getOutput();
echo $j2->getOutput();
echo $j3->getOutput();

It would be nice for the Apache SAPI to do this as well, so I could debug the requests more easily (I use Zend Server which, ATM, only supports Apache).  But it would seem that PHP-FPM would have an easier time of doing this: because it manages its own resources, it could do things like maintain a separate pool reserved for queued requests.

Maybe I have unique use cases or I just like making things more complicated for myself.  But it would be really nice to have some kind of queuing work out of the box.

EAV Properties for Magento

The only thing I hate more than bad code completion is no code completion.  When working with PHP arrays for configuration there are often options that you need to remember to properly configure the object, factory or whatever, that you are using.

What I really like to do for my own code is that when I do have some kind of array-based configuration I like to have class constants that define what options there are for me to set.  Because they are constants they cut down on fat-finger problems and also take out a lot of the guesswork.

Magento is pretty good about doing this, but one place where I have not found much by way of constants is in the EAV system.  There are a number of EAV properties that you can set.  Here is a table of the ones I found.

Config Key | Merged Key | Default Value
--- | --- | ---
backend | backend_model |
type | backend_type | varchar
table | backend_table |
frontend | frontend_model |
input | frontend_input | text
label | frontend_label |
frontend_class | frontend_class |
source | source_model |
required | is_required | 1
user_defined | is_user_defined | 0
default | default_value |
unique | is_unique |
note | note |
global | is_global | 1 (Mage_Catalog_Model_Resource_Eav_Attribute::SCOPE_GLOBAL)
input_renderer | frontend_input_renderer |
visible | is_visible | 1
searchable | is_searchable | 0
filterable | is_filterable | 0
comparable | is_comparable | 0
visible_on_front | is_visible_on_front | 0
wysiwyg_enabled | is_wysiwyg_enabled | 0
is_html_allowed_on_front | is_html_allowed_on_front | 0
visible_in_advanced_search | is_visible_in_advanced_search | 0
filterable_in_search | is_filterable_in_search | 0
used_in_product_listing | used_in_product_listing | 0
used_for_sort_by | used_for_sort_by | 0
apply_to | apply_to | 0
position | position | 0
is_configurable | is_configurable | 1
used_for_promo_rules | is_used_for_promo_rules | 0

Kind of a lot.  And as you get more experience you will memorize them.  But what a pain in the butt.

So I created a new GitHub repository that allows you to simply use some class constants instead of hand-typed array keys for EAV configuration.  Most of the code I have seen for EAV configuration uses inputted key values instead of constants and my Google searches have not yielded much.  So my presumption, possibly incorrect, is that this kind of helper class does not exist.

Using the class is very, very simple.  You only need to put the code from GitHub in code/local/Eschrade/Util.  The autoloader should take care of the rest.  No need to manage a config file.

$installer = Mage::getResourceModel('catalog/setup','default_setup');
 
$config = array(
	Eschrade_Util_Statics::EAV_TYPE_ALIAS 	=> Eschrade_Util_Statics::EAV_TYPE_TEXT,
	Eschrade_Util_Statics::EAV_INPUT_ALIAS	=> Eschrade_Util_Statics::EAV_INPUT_BOOLEAN,
	Eschrade_Util_Statics::EAV_LABEL_ALIAS 	=> 'Some Attribute',
);
 
$installer->addAttribute(
	Mage_Catalog_Model_Product::ENTITY,
	'some_attribute',
	$config
);
 
$installer->updateAttribute(
	Mage_Catalog_Model_Product::ENTITY,
	'some_attribute',
	Eschrade_Util_Statics::EAV_LABEL,
	'Some Other Attribute'
);

Note that for the addAttribute() call I use the name  + _ALIAS.  So ‘label’ for adding an attribute is Eschrade_Util_Statics::EAV_LABEL_ALIAS but for updating an attribute it is Eschrade_Util_Statics::EAV_LABEL.
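
If you are wondering how the two sets of names relate, the following sketch shows the idea with three of the keys from the table.  EavKeys is a hypothetical class written for this illustration; it is not the class from the GitHub repository and its constant names are my own.

```php
<?php
// EavKeys is a hypothetical class for illustration; it is not the class from
// the GitHub repository mentioned above. The key names come from the table.
final class EavKeys
{
    // Keys accepted when adding an attribute ("Config Key" column).
    const TYPE_ALIAS  = 'type';
    const INPUT_ALIAS = 'input';
    const LABEL_ALIAS = 'label';

    // Keys used when updating an attribute ("Merged Key" column).
    const TYPE  = 'backend_type';
    const INPUT = 'frontend_input';
    const LABEL = 'frontend_label';

    // Translate an alias-keyed array into merged keys; unknown keys pass through.
    public static function toMergedKeys(array $config)
    {
        $map = array(
            self::TYPE_ALIAS  => self::TYPE,
            self::INPUT_ALIAS => self::INPUT,
            self::LABEL_ALIAS => self::LABEL,
        );
        $merged = array();
        foreach ($config as $key => $value) {
            $merged[isset($map[$key]) ? $map[$key] : $key] = $value;
        }
        return $merged;
    }
}

$config = array(
    EavKeys::TYPE_ALIAS  => 'text',
    EavKeys::INPUT_ALIAS => 'boolean',
    EavKeys::LABEL_ALIAS => 'Some Attribute',
);

print_r(EavKeys::toMergedKeys($config));
```

A mistyped constant name fails loudly instead of silently creating an array key that nothing reads, which is the whole appeal.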

Also note that this class is not complete.  I will make additions to it as I have time and I hope that others who see holes make pull requests to add more things.

Please, please fork it, make additions and fix errors you may see.  Personally, I would rather be able to choose from a list of options than have to manually type things in.  It makes things much cleaner and more predictable.  So I hope that this will make things a little easier for you.

The First Annual Report on Programmer Ass-hattery

The 2013 Kevin Schroeder Report on Programmer Ass-hattery

Taking a cue from TIOBE, where you can take Google search results and make them mean anything that you want them to, I decided that I was going to try an experiment and see if I could discern, from Google search results, how likely a programmer for a given language would engage in ass-hattery.

In short, this is the most important programming index you will ever see.  And unlike many other indexes I will put my hastily concocted formula up front and center for your enjoyment and ridicule.

The formula is

(Google results for “X rocks” / Google results for “X sucks”) * (language score sum / X language score)

I got Google results for PHP, Ruby On Rails, Java, Python, C, C++, Objective-C, C# and JavaScript.  I took the language score from Lang Pop.  The language popularity values for each of the programming languages are

Language | Score
--- | ---
PHP | 5591
Ruby On Rails | 667
Java | 6930
Python | 4240
C | 9948
C++ | 5584
Objective-C | 352
C# | 429
JavaScript | 1572

The sum of all of these is 35313.

To illustrate the formula, the Python Programmer Ass-hattery score is 20.6387.  “Python rocks” results are 11,300, “Python sucks” is 4,560.  So the formula would look like this: (11300 / 4560) * (35313 / 4240).  Generally, the higher the score, the more likely you will want them to wear their ass as a hat.
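
For those who would rather check my math than trust it, here is the formula as a few lines of PHP.  The asshattery_score() function name is made up for this sketch; the language scores and the Python hit counts are the ones quoted above.

```php
<?php
// Language scores as listed above.
$languageScores = array(
    'PHP'           => 5591,
    'Ruby On Rails' => 667,
    'Java'          => 6930,
    'Python'        => 4240,
    'C'             => 9948,
    'C++'           => 5584,
    'Objective-C'   => 352,
    'C#'            => 429,
    'JavaScript'    => 1572,
);

// (rocks / sucks) * (sum of all language scores / this language's score).
// asshattery_score() is a name invented for this sketch.
function asshattery_score($rocks, $sucks, $languageScore, $scoreSum)
{
    return ($rocks / $sucks) * ($scoreSum / $languageScore);
}

$sum = array_sum($languageScores);
$python = asshattery_score(11300, 4560, $languageScores['Python'], $sum);

printf("score sum: %d\n", $sum);
printf("Python:    %.4f\n", $python);
```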

The theory is that the louder one talks compared to their actual popularity the more likely they will engage in ass-hattery.

The results for the First Annual Report on Programmer Ass-hattery are:

The 2013 Kevin Schroeder Report on Programmer Ass-hattery

It’s math so it can’t be wrong.

Comments proving ass-hattery will either be deleted or ridiculed.  This was meant as fun (unlike some other indexes).

For the last time, the file system is not slow!!

Having started working at Magento I have been making myself more familiar with many of the different parts of the community.  I have spent a fair amount of time over the past several weeks trying to understand how people work with Magento and what their problems are.

One of the things that often comes up is speed.  And there are lists of things that people can do to try and make Magento faster.  But there’s something that bugs me in many of these lists.  Often people say that in order to make your Magento installation run faster you need to put certain things on tmpfs or RAM.

Makes sense, right?  The disk is slow, RAM is fast and so a RAM drive must be fast.  Right?

Nope.

Yes.  The disk is slow.  But that does not mean that the file system is.  The disk is only part of the file system.  The file system includes this nifty thing called a disk block cache.  The file system will cache often-used disk blocks in RAM.

What?  The file system uses RAM?  Yep.  When you type in “top” and you see that value for “cached”?  That is RAM that the operating system is using to store disk blocks.

The result of that caching is this chart.  It measures the throughput in number of requests per second of a static resource via Nginx over HTTP load tested from a remote host on the same network.

Server Throughput (HTTP Requests per Second)

fs-benchmarks

To conduct the test I did 3 test runs for each type of file system: a physical ext3 disk, tmpfs and a RAM drive.  I ran 10,000 iterations via 100 concurrent requests.  The results of the test show that the physical media was actually the fastest!

I was actually a little surprised by that.  I was expecting that the physical (local) file system would only keep up; instead it was faster.  I would be willing to chalk that win up to entropy in the system, but the assertion that a RAM drive or tmpfs is faster than the file system is clearly not true.

Don’t get me wrong.  RAM drives are great if you want to explicitly define a file system which WILL stay in RAM.  However, I side with Linus Torvalds (when he was talking about O_DIRECT) that the purpose of an operating system is to manage a lot of this for you.  You might be able to get some better results for RAM or tmpfs from some tuning but it would seem to be a micro-optimization at best or a giant waste of time at worst.

No-.htaccess httpd.conf file for Magento

A couple of days ago I wrote a blog post on why you should not use .htaccess files (that is, any AllowOverride setting other than None) on a production web server.  What you should do is place the .htaccess configuration information into your httpd.conf file instead.

So of course I was asked what that would look like.  So here it is.  I took all of the .htaccess settings, stripped some of the superfluous ones and removed the comments ( for clarity :-) ) and here is what you have.  Customize for your own site, of course.

<VirtualHost *:80>
	ServerName magento.loc
	DocumentRoot /var/www/html
	DirectoryIndex index.php
 
	<Directory /var/www/html/var/>
		Order deny,allow
		Deny from all
	</Directory>
 
	<Directory /var/www/html/>
		AllowOverride None
		<IfModule mod_php5.c>
 
		    php_value memory_limit 128M
		    php_value max_execution_time 18000
 
		    php_flag magic_quotes_gpc off
		    php_flag session.auto_start off
 
		</IfModule>
 
		<IfModule mod_security.c>
		    SecFilterEngine Off
		    SecFilterScanPOST Off
		</IfModule>
 
		<IfModule mod_ssl.c>
		    SSLOptions StdEnvVars
		</IfModule>
		<IfModule mod_rewrite.c>
 
		    Options +FollowSymLinks
		    RewriteEngine on
 
		    #RewriteBase /magento/
		    RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
		    RewriteCond %{REQUEST_URI} !^/(media|skin|js)/
		    RewriteCond %{REQUEST_FILENAME} !-f
		    RewriteCond %{REQUEST_FILENAME} !-d
		    RewriteCond %{REQUEST_FILENAME} !-l
		    RewriteRule .* index.php [L]
 
		</IfModule>
 
		AddDefaultCharset Off
		#AddDefaultCharset UTF-8

		<IfModule mod_expires.c>
		    ExpiresDefault "access plus 1 year"
		</IfModule>
		Order allow,deny
		Allow from all
	</Directory>
 
	<Directory /var/www/html/includes/>
		Order deny,allow
		Deny from all
	</Directory>
 
	<Directory /var/www/html/errors/>
		<FilesMatch "\.(xml|phtml)$">
		    Deny from all
		</FilesMatch>
	</Directory>
 
	<Directory /var/www/html/pkginfo/>
		Order deny,allow
		Deny from all
	</Directory>
 
	<Directory /var/www/html/app/>
		Order deny,allow
		Deny from all
	</Directory>
 
	<Directory /var/www/html/lib/>
		Order deny,allow
		Deny from all
	</Directory>
 
	<Directory /var/www/html/downloader/>
		<IfModule mod_deflate.c>
 
		    RemoveOutputFilter DEFLATE
		    RemoveOutputFilter GZIP
 
		</IfModule>
 
		<Files ~ "\.(cfg|ini|xml)$">
		    order allow,deny
		    deny from all
		</Files>
	</Directory>
 
	<Directory /var/www/html/downloader/template/>
		Order deny,allow
		Deny from all
	</Directory>
 
	<Directory /var/www/html/media/>
		Options All -Indexes
		<IfModule mod_php5.c>
			php_flag engine 0
		</IfModule>
 
		AddHandler cgi-script .php .pl .py .jsp .asp .htm .shtml .sh .cgi
		Options -ExecCGI
 
		<IfModule mod_rewrite.c>
		    Options +FollowSymLinks
		    RewriteEngine on
		    RewriteCond %{REQUEST_FILENAME} !-f
		    RewriteRule .* ../get.php [L]
		</IfModule>
	</Directory>
 
	<Directory /var/www/html/media/customer/>
		Order deny,allow
		Deny from all
	</Directory>
 
	<Directory /var/www/html/media/downloadable/>
		Order deny,allow
		Deny from all
	</Directory>
 
</VirtualHost>
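The conversion above was done by hand. The mechanical part of it (dropping comments and blank lines from a .htaccess file and wrapping what is left in a `<Directory>` block) can be sketched in a few lines of Python. This is only a sketch under assumptions: the function name `htaccess_to_directory_block` is hypothetical, nested indentation is flattened, and nothing checks whether a directive behaves the same way in `<Directory>` context, so the output still needs a human review.

```python
def htaccess_to_directory_block(directory, htaccess_text, indent="\t"):
    """Wrap the directives of one .htaccess file in a <Directory> block.

    Comment lines and blank lines are dropped. Every remaining line is
    indented by exactly one level, so nested blocks lose their indentation.
    """
    lines = [f"<Directory {directory}>"]
    for raw in htaccess_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        lines.append(indent + line)
    lines.append("</Directory>")
    return "\n".join(lines)
```

Running it once per .htaccess file in the tree, with `directory` set to the absolute path of the folder that held the file, produces blocks shaped like the ones in the VirtualHost above.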

Why you should not use .htaccess (AllowOverride All) in production

Commonly known as .htaccess, AllowOverride is a neat little feature that allows you to tweak the server’s behavior without modifying the configuration file or restarting the server.  Personally, I think this is great for development purposes.  It allows you to quickly test various server configurations without needing to mess with restarting the server.  It helps you be more (buzzword alert!) agile.

Beyond the obvious security problems of allowing configuration modifications in a public document root there is also a performance impact.  With AllowOverride enabled, Apache will attempt an open() call for a .htaccess file in every directory along the path to the requested file, on every request.

To demonstrate this I used a program called strace, which traces a running process and lists each system call that it makes.

First we’ll take a look at the strace with AllowOverride set to None.

semop(1638426, {{0, -1, SEM_UNDO}}, 1) = 0
epoll_wait(42, {{EPOLLIN, {u32=3507213864, u64=139813282633256}}}, 2, 10000) = 1
accept4(4, {sa_family=AF_INET6, sin6_port=htons(55755), inet_pton(AF_INET6, "::ffff:192.168.0.212", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28], SOCK_CLOEXEC) = 43
semop(1638426, {{0, 1, SEM_UNDO}}, 1) = 0
getsockname(43, {sa_family=AF_INET6, sin6_port=htons(80), inet_pton(AF_INET6, "::ffff:192.168.0.212", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
fcntl(43, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(43, F_SETFL, O_RDWR|O_NONBLOCK) = 0
read(43, "GET /test.txt HTTP/1.0\r\nHost: ma"..., 8000) = 87
gettimeofday({1361542861, 683952}, NULL) = 0
stat("/var/www/magento.loc/test.txt", {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
open("/var/www/magento.loc/test.txt", O_RDONLY|O_CLOEXEC) = 44
fcntl(44, F_GETFD) = 0x1 (flags FD_CLOEXEC)
fcntl(44, F_SETFD, FD_CLOEXEC) = 0
mmap(NULL, 12, PROT_READ, MAP_SHARED, 44, 0) = 0x7f28cfc74000
writev(43, [{"HTTP/1.1 200 OK\r\nDate: Fri, 22 F"..., 267}, {"hello world\n", 12}], 2) = 279
munmap(0x7f28cfc74000, 12) = 0
write(12, "192.168.0.212 - - [22/Feb/2013:0"..., 79) = 79
shutdown(43, 1 /* send */) = 0
poll([{fd=43, events=POLLIN}], 1, 2000) = 1 ([{fd=43, revents=POLLIN|POLLHUP}])
read(43, "", 512) = 0
close(43) = 0
read(6, 0x7fff79e2d7cf, 1) = -1 EAGAIN (Resource temporarily unavailable)
close(44) = 0
semop(1638426, {{0, -1, SEM_UNDO}}, 1

Now let’s take a look at the strace results with AllowOverride set to All.

semop(1736730, {{0, -1, SEM_UNDO}}, 1) = 0
epoll_wait(42, {{EPOLLIN, {u32=3392874024, u64=140410168747560}}}, 2, 10000) = 1
accept4(4, {sa_family=AF_INET6, sin6_port=htons(55795), inet_pton(AF_INET6, "::ffff:192.168.0.212", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28], SOCK_CLOEXEC) = 43
semop(1736730, {{0, 1, SEM_UNDO}}, 1) = 0
getsockname(43, {sa_family=AF_INET6, sin6_port=htons(80), inet_pton(AF_INET6, "::ffff:192.168.0.212", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
fcntl(43, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(43, F_SETFL, O_RDWR|O_NONBLOCK) = 0
read(43, "GET /test.txt HTTP/1.0\r\nHost: ma"..., 8000) = 87
gettimeofday({1361543373, 140117}, NULL) = 0
stat("/var/www/magento.loc/test.txt", {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
open("/var/www/.htaccess", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/var/www/magento.loc/.htaccess", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
open("/var/www/magento.loc/test.txt/.htaccess", O_RDONLY|O_CLOEXEC) = -1 ENOTDIR (Not a directory)
open("/var/www/magento.loc/test.txt", O_RDONLY|O_CLOEXEC) = 44
fcntl(44, F_GETFD) = 0x1 (flags FD_CLOEXEC)
fcntl(44, F_SETFD, FD_CLOEXEC) = 0
mmap(NULL, 12, PROT_READ, MAP_SHARED, 44, 0) = 0x7fb3c8bf9000
writev(43, [{"HTTP/1.1 200 OK\r\nDate: Fri, 22 F"..., 267}, {"hello world\n", 12}], 2) = 279
munmap(0x7fb3c8bf9000, 12) = 0
write(12, "192.168.0.212 - - [22/Feb/2013:0"..., 79) = 79
shutdown(43, 1 /* send */) = 0
poll([{fd=43, events=POLLIN}], 1, 2000) = 1 ([{fd=43, revents=POLLIN|POLLHUP}])
read(43, "", 512) = 0
close(43) = 0
read(6, 0x7fff95abfc1f, 1) = -1 EAGAIN (Resource temporarily unavailable)
close(44) = 0
semop(1736730, {{0, -1, SEM_UNDO}}, 1

You can clearly see the additional open() calls being made to try and discover the .htaccess file.  In this case the calls are completely superfluous because we have nothing there.  But even so we have a significant impact on static file throughput.
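To get a feel for how the number of probes grows with directory depth, the pattern in the trace above can be modeled in a few lines of Python. This is a sketch of what the trace shows, not of Apache's actual implementation. The function name `htaccess_probes` is hypothetical, and it assumes overrides are allowed from /var/www downward, which is where the first probe in the trace appears.

```python
from pathlib import PurePosixPath


def htaccess_probes(override_root, requested_file):
    """List the .htaccess paths probed for one request, following the trace.

    override_root is the topmost directory in which overrides are allowed.
    The final probe treats the requested file itself as a directory, which
    is the ENOTDIR line in the trace.
    """
    root = PurePosixPath(override_root)
    target = PurePosixPath(requested_file)
    relative = target.relative_to(root)  # raises ValueError if outside root
    probes = [root / ".htaccess"]
    current = root
    for part in relative.parts:
        current = current / part
        probes.append(current / ".htaccess")
    return [str(path) for path in probes]
```

For the request in the trace, `htaccess_probes("/var/www", "/var/www/magento.loc/test.txt")` returns the same three paths that strace recorded. A file nested six directories deeper would add six more probes to every single request.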

AllowOverride None

Concurrency Level: 10
Time taken for tests: 2.441 seconds
Complete requests: 10000
Failed requests: 0
Write errors: 0
Total transferred: 2790279 bytes
HTML transferred: 120012 bytes
Requests per second: 4096.02 [#/sec] (mean)
Time per request: 2.441 [ms] (mean)
Time per request: 0.244 [ms] (mean, across all concurrent requests)
Transfer rate: 1116.12 [Kbytes/sec] received

AllowOverride All

Concurrency Level: 10
Time taken for tests: 3.922 seconds
Complete requests: 10000
Failed requests: 0
Write errors: 0
Total transferred: 2790558 bytes
HTML transferred: 120024 bytes
Requests per second: 2549.42 [#/sec] (mean)
Time per request: 3.922 [ms] (mean)
Time per request: 0.392 [ms] (mean, across all concurrent requests)
Transfer rate: 694.76 [Kbytes/sec] received

The requests where AllowOverride was turned off ran in roughly 62% of the time of the ones where AllowOverride was turned on (2.441 seconds versus 3.922 seconds for 10,000 requests).
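That figure can be checked directly from the two benchmark reports above. The numbers below are copied from those reports; only the variable names are new.

```python
# Values copied from the two benchmark reports (AllowOverride None vs. All).
off_seconds, on_seconds = 2.441, 3.922   # "Time taken for tests"
off_rps, on_rps = 4096.02, 2549.42       # "Requests per second"

time_ratio = off_seconds / on_seconds    # share of the time needed with None
throughput_gain = off_rps / on_rps - 1   # extra requests per second with None

print(f"AllowOverride None takes {time_ratio:.0%} of the time of All")
print(f"AllowOverride None serves {throughput_gain:.0%} more requests per second")
```

Put the other way around, turning AllowOverride on cost a little over a third of the static file throughput in this test.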

And remember, this is just the impact of the extra file operations.  No .htaccess files existed in this test, so it does not include the time Apache would spend parsing and applying their directives on each request.

So the data clearly shows a negative impact from having AllowOverride turned on in a production environment.  Instead it will generally be better to take the settings from .htaccess and place them in your httpd configuration file.

[UPDATE]

In fact Mike Willbanks says you should never do it.  I agree with him, but I wouldn’t make as big a stink in dev as I would in prod.

 

Don’t modify index.php in Magento for multi-store configs

unless you really, really have to.  I did some quick Googling on this and found that a number of places recommend making changes to index.php.

In my humble opinion it is better to leave the index.php file alone and configure your stores via SetEnv in your Apache config, as is noted in the Magento wiki.

For example

<VirtualHost *:80>
SetEnv MAGE_RUN_CODE msv
ServerName kps.loc
DocumentRoot /var/www/magento.loc/magento
CustomLog logs/magento.access_log varnishcombined
ErrorLog logs/magento.error_log
</VirtualHost>

<VirtualHost *:80>
SetEnv MAGE_RUN_CODE base
ServerName magento.loc
DocumentRoot /var/www/magento.loc/magento
CustomLog logs/magento.access_log varnishcombined
ErrorLog logs/magento.error_log
</VirtualHost>

No code changes required and upgrades won’t break your modifications.
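With more than a couple of stores, writing these blocks by hand gets repetitive. As a rough sketch, assuming every store shares one document root as in the example above, the blocks could be generated from a mapping of host name to run code. The names `VHOST_TEMPLATE` and `render_vhosts` are hypothetical, and the log directives are left out to keep it short.

```python
# One VirtualHost per store; only the host name and run code differ.
VHOST_TEMPLATE = """<VirtualHost *:80>
    SetEnv MAGE_RUN_CODE {run_code}
    ServerName {server_name}
    DocumentRoot {document_root}
</VirtualHost>"""


def render_vhosts(stores, document_root):
    """Render one VirtualHost block per {server_name: run_code} entry."""
    blocks = [
        VHOST_TEMPLATE.format(
            run_code=run_code,
            server_name=server_name,
            document_root=document_root,
        )
        for server_name, run_code in stores.items()
    ]
    return "\n\n".join(blocks)
```

Calling `render_vhosts({"kps.loc": "msv", "magento.loc": "base"}, "/var/www/magento.loc/magento")` yields two blocks equivalent to the ones shown, minus the logging lines.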

That is all.
