(Starting) Using Dependency Injection in Magento 2

One of the biggest switches you will experience when moving from Magento 1 to Magento 2 is inversion of control.  It is a very different concept to get used to, but once you have it you will be a very happy person.  Maybe.  At least I was.  Understanding how to use dependency injection and understanding how dependency injection works are two completely different things, of which the former is probably more important if you are new to it.

I wrote an introduction to Dependency Injection for Zend Framework a while back and was able to work out some of the kinks in my understanding of how a DI container works over four blog posts.  It took a while for me to get it, but much of what I learned for Zend Framework does directly apply to Magento 2.

Dependency Injection is handled in Magento 2 via the class Magento\App\ObjectManager.  And if you look at that from within the context of Magento 2 you are probably pooping your pants.  200+ class instances, configuration options, dogs and cats living together.

But is it really that difficult?  Let’s start with a little sample script.

use Magento\App\ObjectManager;
require_once 'app/bootstrap.php';

class Test {}

$om = new ObjectManager();
$test = $om->get('Test');
echo get_class($test);

We define a class called Test and create an instance of the ObjectManager.  We then ask the object manager for an instance of the class Test and echo it out.  When we do we get

Test

OK.  How did it know about the class Test?  It didn’t.  You did.  You knew you needed an instance of Test and you simply asked the DI container for one.  It did a little magic on the backend and gave you exactly what you asked for.

So why would you do this instead of just calling “new”?

Ahh, that is where the fun comes in.  What if your class Test had a dependency on another class called Test2, whose helloWorld() method it requires?  So we will

  1. Create a class called Test2
  2. Add a method to that class called helloWorld()
  3. Declare our dependency in the constructor of the class Test
  4. Define a method Test::getOutput() which calls Test2::helloWorld()
  5. profit
use Magento\App\ObjectManager;
require_once 'app/bootstrap.php';

class Test2
{
  public function helloWorld()
  {
    return 'hello world';
  }
}

class Test
{
  protected $test2;

  public function __construct(Test2 $test2)
  {
    $this->test2 = $test2;
  }

  public function getOutput()
  {
    return $this->test2->helloWorld();
  }
}

$om = new ObjectManager();
$test = $om->get('Test');
echo $test->getOutput();

When we run this code we get the following output

hello world

“Wait”, you might be thinking.  “Dependency Injection is supposed to be hard!”

Well, it can be, particularly when you need to start configuring the object instances.  However, from a basic perspective what you see here is what Dependency Injection is all about.

We will continue to dive deeper into Dependency Injection in Magento 2 as time goes on, but this is a good place to start.  If you are not familiar with DI you will probably have a bunch of questions.  We will get to those as time goes on.

[UPDATE]

I was (correctly) notified on Twitter that I needed to be careful not to mix up Dependency Injection and a Dependency Injection Container.  Yes, I mixed the concepts, but for good reason.  If you were to look at my code examples and remove the reference to ObjectManager, meaning that you directly injected the dependency, you would be using Dependency Injection, e.g. new Test(new Test2).  Technically, that is Dependency Injection.  A Dependency Injection Container is when you ask a central object for an instance of a class and IT satisfies the dependencies.
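To make the distinction concrete, here is the difference in miniature, using the Test and Test2 classes from the examples above:

// Plain Dependency Injection: you construct and inject the dependency yourself.
$test = new Test(new Test2());

// Dependency Injection Container: you ask the central object for an instance
// and it satisfies the dependencies for you.
$om = new ObjectManager();
$test = $om->get('Test');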

In Magento 2, however, while the two are technically separate concepts they will, in 99.999% (five 9's) of the scenarios, be married.  That said, if you are going to be building unit tests for your models, and you should, then understanding the distinction is important.  You might use the DIC to contain mocks or you might directly inject the mocks yourself.  I don’t know which way is best at the moment.

Creating a module in Magento 2

This is the first of what I expect will be several blog posts on Magento 2 as  I learn the system.  I believe that one of the best ways to learn something is to write out what it is you are learning.  This forces you to think through the concepts and determine how to explain them to others.  In doing so you are forced to use terms that are familiar to describe this new thing.  This helps to solidify the concepts in your own mind, making it easier to remember.

But still, there are three caveats.  1) Magento 2 is not out yet (as I write this) so things may change.  2) As I write this I am learning it and so there might be inaccuracies. 3) The information gleaned here has not been passed through the core team and so it is my interpretation of what I see.

One of the most visible changes from Magento 1 is the lack of code pools.  No longer is there a core, community and local set of directories.  I liked that there was this separation between different types of code, but it didn’t functionally add anything.  So this is a change that I am ambivalent about.

The next change is that rather than storing all of your module discovery XML files in app/etc/modules they are now stored in the module’s etc directory.  The benefit of this is that the modules are now more self-contained than they used to be.  This is a change that I am quite happy about.  And while there are no longer any code-pool scopes there still is loading precedence.  The order of precedence is mage (Magento), custom (others), and base (you).  This is determined by the location of the module.xml file.  If it is in app/code/Magento… it is in the “mage” scope.  If module.xml is under any other directory in app/code then it is put into the “custom” scope.  If module.xml exists in a directory under app/etc, such as app/etc/me/module.xml, it is put into the “base” scope.  The names of the scopes are irrelevant; what is important is the order: mage, custom, base.
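As a rough illustration (my own sketch, not actual core code), the scope determination works something like this:

// Hypothetical sketch of mapping a module.xml location to its load-order scope.
function getModuleScope($moduleXmlPath)
{
  if (strpos($moduleXmlPath, 'app/code/Magento/') === 0) {
    return 'mage';   // shipped by Magento, loaded first
  }
  if (strpos($moduleXmlPath, 'app/code/') === 0) {
    return 'custom'; // any other vendor under app/code
  }
  return 'base';     // e.g. app/etc/me/module.xml, loaded last
}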

The parsing of the module.xml files is done by a class called Magento\Config\Dom.  This class does not use SimpleXML like before, but rather uses DOMDocument.  An interesting thing about this change is that it is possible to reference configuration nodes by ID.
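As a generic PHP illustration (this is not Magento code), DOMDocument plus DOMXPath lets you look a node up by an id attribute, something SimpleXML has no direct API for:

$doc = new DOMDocument();
$doc->loadXML('<config><module id="My_Module" version="0.0.1"/></config>');

// Find the node by its id attribute.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//module[@id="My_Module"]');
echo $doc->saveXML($nodes->item(0));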

However, there is a problem.  In my IDE at least (Zend Studio), DOMDocument is not viewable in the debug variable view whereas SimpleXML was.  I expect that this might make debugging configuration merge errors a little more difficult because I can’t see what the resulting node looks like.  That said, the merged XML document is converted to an array so after the configuration merge is completed you will see what the config looks like.  Additionally, the module nodes are validated against an XSD located in lib/Magento/Module/etc/module.xsd.  Modules are then stored based on dependency and injected into the cache.
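If you want to run that validation yourself, plain DOMDocument can do it.  The module path here is a hypothetical example; the XSD path is the one mentioned above:

$doc = new DOMDocument();
$doc->load('app/code/My/Module/etc/module.xml'); // hypothetical module

// Validate the declaration against the shipped schema.
if (!$doc->schemaValidate('lib/Magento/Module/etc/module.xsd')) {
  echo "module.xml failed schema validation\n";
}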

This is all done by a class called Magento\App\ObjectManager\ConfigLoader which is injected into a class called Magento\ObjectManager\ObjectManager which is a replacement for Mage::getModel().  That’s not quite true but, like I said about making references to what is known, it is true enough.

Unlike Magento 1 the module list is not stored in the global configuration.  It is stored in the ObjectManager in an object of type Magento\Module\ModuleList.  This means that you will not be querying the configuration for the module list; you will need to define the module list as a dependency in your class.
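For example, a class that needs the module list would declare it in its constructor, something like this sketch:

class MyClass
{
  protected $moduleList;

  // The module list arrives as a constructor dependency rather than
  // being pulled out of a global configuration object.
  public function __construct(\Magento\Module\ModuleList $moduleList)
  {
    $this->moduleList = $moduleList;
  }
}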

What this also means is that having a config.xml does not “activate” a module, like in Magento 1.  This also hints that you do not need to do that damn /config/models/<name>/class stuff.  There is a LOT of stuff you can do via dependency that will replicate some of that but I will cover that in a later blog post on DI in Magento 2.  Once you get into area-based DI I think you’ll also be getting into some cool stuff.

So let’s create a module that simply echos “Hello World” in the content.

We will start with a module.xml file in app/code/Eschrade/HelloWorld/etc/.

<?xml version="1.0"?>
<config>
  <module name="Eschrade_HelloWorld" version="0.0.1" active="true" />
</config>

A few things are different here.  For one, the options are specified via XML attributes instead of XML nodes.  Secondly, the version is now required.

In an effort to stick to one topic we will output our “Hello World” via the “controller_action_predispatch” event so we don’t yet have to start talking about Invocation Chains, though we will need to talk about DI for a little bit.

To register an event we need to go to our app/code/Eschrade/HelloWorld/etc directory and create a new file called event.xml with the following content.

<?xml version="1.0" encoding="UTF-8"?>
<config>
  <event name="controller_action_predispatch">
    <observer
      name="eschrade_helloworld_echo"
      instance="Eschrade\HelloWorld\Model\Observer"
      method="echoHello" />
  </event>
</config>

Gotta say, I really like this new format compared to the old one.  The event name is specified via an attribute, and the node has a child called observer with three attributes: name, which gives the observer a unique name; instance, which is the class name of your observer; and method, which is the method that will be called.

The observer is nice and simple.  Put it in app/code/Eschrade/HelloWorld/Model/Observer.php.

namespace Eschrade\HelloWorld\Model;

class Observer
{
  protected $_response;

  public function __construct(
    \Magento\App\Response\Http $response
  ) {
    $this->_response = $response;
  }

  public function echoHello(\Magento\Event\Observer $observer)
  {
    $this->_response->appendBody('Hello World');
  }
}

In order for us to echo it properly we need to append it to the response object.  Prior to this we would need to call Mage::app()->getResponse()->appendBody().  In this case we simply declared, through the constructor, that we needed the response object and placed it in a protected property of the class.  Then when the observer method is called we simply reference the property and get the following output.

Hello World<!doctype html>
<html lang="en" class="no-js">
<head>
<meta charset="utf-8"/>

The underlying implementation to get this working is somewhat complex but I am quite pleased at how easy it was to get this up and running.

How much memory does Magento use?

So, I was teaching the Magento Performance for System Administrators class the other day.  I’ve delivered the class several times but this is the first time that somebody asked me “How much memory does Magento actually use?”  Now, I know what you’re supposed to set it at, but I’ve never measured actual usage.  So I gave some bullcrap answer about how it really depends on a bunch of things and that I really shouldn’t give a precise answer.  But the individual persisted and I was forced to put my tail between my legs and admit that I didn’t know.

So I promised that I would take a look and here are my results.  I’m sure someone will get their panties in a wad about how it doesn’t do this, that, and the other thing.  The purpose of these tests was to give a baseline, and that’s it.  I expect that an actual catalog implementation will have consumption values higher than what is seen here.

The way I measured the memory usage was that I added some code to Mage_Core_Model_App::dispatchEvent() and took memory measurements each time an event was triggered.  I called both memory_get_usage() and memory_get_usage(true).  I also injected a marker at controller_action_predispatch, core_block_abstract_prepare_layout_before,  core_block_abstract_to_html_before, resource_get_tablename, and controller_front_send_response_after.  The reason for that is so that I could get visual cues as to which part of the execution (controller, layout prepare, layout render) was responsible for each.
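The instrumentation itself would look something like this helper, called each time dispatchEvent() fires (the file name and CSV format here are illustrative choices, not my actual code):

// Hypothetical measurement helper; log the event name and both memory readings.
function logMemoryUsage($eventName)
{
  file_put_contents(
    '/tmp/memory.csv',
    sprintf(
      "%s,%d,%d\n",
      $eventName,
      memory_get_usage(),     // bytes currently in use by PHP
      memory_get_usage(true)  // bytes actually allocated from the system
    ),
    FILE_APPEND
  );
}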

The Y axis is memory used in MB.  I divided the memory_get_usage() values by 1024*1024 to get the MB used.  Additionally, because the events were used as the sampling point the X axis represents the event count, not elapsed time as your brain might think.

The data that I used was the Magento sample data.  While it is unrealistic to have such a small catalog in Magento it is the only consistent data that is really available.

[Chart: memory usage, home page]

The first mark is the start of the request.  The second is controller_action_predispatch.  The third is core_block_abstract_prepare_layout_before.  The fourth is core_block_abstract_to_html_before.

[Chart: memory usage, category page without images]

At this point I had realized that I had not copied the sample media over.  So I re-ran the request with the media.

[Chart: memory usage, category page with images]

There were only 5 products in this category so I wanted to run a page that had pagination.

[Chart: memory usage, category 28 with 9 products per page]

[Chart: memory usage, category 28 with 30 products per page]

Obviously there were some images being re-sized here. But actual usage, while higher, was not overly significant.

Then, of course, there is the question of product pages

[Chart: memory usage, simple product page]

[Chart: memory usage, configurable product page]

There was no real difference between any of the other product types.

Adding to cart was relatively benign.

[Chart: memory usage, add to cart]

As was displaying the cart (a 302 Found redirect issued after adding to cart).

[Chart: memory usage, display cart]

While it is a little difficult to read, here is a chart that shows several of these charts in comparison.

[Chart: memory usage, all scenarios compared]

A couple of takeaways from this:

  1. Image processing sucks up memory and can push allocation past memory_limit.
  2. Layout generation generally uses the most memory.
  3. From a memory perspective, the product type does not seem to do much to memory consumption.
  4. From a memory perspective, the number of items in a category does not seem to have much impact on memory consumption.
  5. If memory is an issue, layout XML optimizations might be a valid place to look to reduce usage.

However, it bears mentioning:

  1. This test did not test very large collections
  2. This test did not test very complicated layouts
  3. This test did not test catalog or shopping cart rules
  4. This test did not test performance
  5. And a bunch of other things.

What is Apdex?

Ever since I started using New Relic I’ve been seeing a number for Apdex.  Given that whenever I see a floating point number I presume that the calculation will be too complex for me to understand, I just presumed that it was some kind of mystical number voodoo.

Turns out that it is not.  It’s actually really simple.  First of all, New Relic didn’t come up with it at all.  There is an apdex.org website created by a consortium of different companies.  New Relic just did what any good tech company does and used it to their advantage.

Calculating the Apdex starts by taking an arbitrary number that represents your SLA goal, or some number below it.  That is your baseline.  If the page response time is below that number a user is considered “Satisfied”.  If the response time is over that number, but by less than a factor of four, the user is considered “Tolerating”.  If the response time is over that number by more than a factor of four they are considered “Frustrated”.  In other words, if your baseline is 500ms, any user below that is satisfied; up to 2 seconds, tolerating; beyond 2 seconds, frustrated.

The purpose of this is to get rid of raw response time measurements as the goal.  It, to some degree, gets rid of the 95th percentile rule.

To calculate an Apdex score create an Excel spreadsheet and have the first column be your Satisfied count, the second your Tolerating, the third your Frustrated, and apply this formula to it: =(A2 + (B2 / 2))/SUM(A2:C2).  That is the Apdex score.  It is the number of satisfied users, plus half the number of tolerating users, divided by the sum of all three.
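The same calculation as a small PHP function, if you would rather skip the spreadsheet:

// Apdex = (satisfied + tolerating / 2) / total
function apdex($satisfied, $tolerating, $frustrated)
{
  $total = $satisfied + $tolerating + $frustrated;
  return $total > 0 ? ($satisfied + $tolerating / 2) / $total : 0;
}

echo apdex(95, 12, 3); // 0.91818..., the first row of the table below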

Here.  Let me show you what that looks like.

Satisfied   Tolerating   Frustrated   Apdex
95          12           3            0.918182
76          10           20           0.764151
60          20           50           0.538462
20          50           50           0.375
0           0            100          0

The score is on a scale of 1 (all users satisfied) to zero (all users frustrated).

What is the time scale?  Whatever you choose.  Your Apdex score is calculated based on whatever time frame you have specified.  Personally, I think a rolling Apdex is a good idea.  But I didn’t really get a proper view of the number until I took these numbers and put them into a multi-axis chart.  The bars are the total count for requests in each of the different categorizations and the line is the Apdex score.

[Chart: request counts per category (bars) with Apdex score (line)]

 

Seeing that corresponding with the raw values helped me to understand what I was looking at for all these months.
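As for the rolling Apdex idea, here is a sketch of what that calculation might look like (window size is an arbitrary choice; response times are assumed to be in the same unit as the baseline):

// Score only the most recent $window response times using the
// satisfied / tolerating / frustrated buckets described above.
function rollingApdex(array $responseTimes, $baseline, $window = 100)
{
  $recent = array_slice($responseTimes, -$window);
  if (count($recent) === 0) {
    return 0;
  }
  $satisfied = $tolerating = 0;
  foreach ($recent as $time) {
    if ($time <= $baseline) {
      $satisfied++;
    } elseif ($time <= $baseline * 4) {
      $tolerating++;
    }
    // anything over four times the baseline counts as frustrated
  }
  return ($satisfied + $tolerating / 2) / count($recent);
}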

Hash value sizes

For giggles, here are examples of hashes for the SHA1, SHA256 and SHA512 hashing mechanisms.

Code

echo hash_hmac(
  'sha1',
  openssl_random_pseudo_bytes(32),
  openssl_random_pseudo_bytes(32)
) . "\n";

echo hash_hmac(
  'sha256',
  openssl_random_pseudo_bytes(32),
  openssl_random_pseudo_bytes(32)
) . "\n";

echo hash_hmac(
  'sha512',
  openssl_random_pseudo_bytes(32),
  openssl_random_pseudo_bytes(32)
) . "\n";

Output

00666100c04543601c9de450b061b4bbc5538c50
5762b94cd40d3e62c7b343df1ca3511343dc00fd99a4f1ee64988bf523c13b8a
49cae01474cfd4adfa21e94d35dd93a1f808dff4538042d5140fc661773bc8d0019311ee3dcb7ed8e2a27b021ae47c006f9a477fb768f60256276cc99e8c4bd0

Have a good weekend.

More – The file system is slow

A while back I wrote one post on how the overhead of logging was so minimal that the performance impact was well worth the benefits of proper logging.  I also wrote another blog post a while back about how deploying your application in tmpfs or on a RAM drive basically buys you nothing.  I had a conversation the other day with a person I respect (I respect any PHP developer who knows how to use strace) about the cost of file IO.  My assertion has been, for a long time, that file IO is not the boogeyman it is claimed to be.

So I decided to test a cross between those two posts.  What is the performance cost of writing 1,000,000 log-sized entries onto a physical file system compared to a RAM drive?  As an added bonus I also wanted to show the difference between a repeated open/write/close and holding open a file handle while writing the log entries, because I think that there is something worth learning there.

The first thing I needed to do was create my RAM drive.  My first test run ran out of disk space so I had to reboot the machine with the kernel parameter ramdisk_size=512000.  This allowed my RAM drive to be up to 512M (or thereabouts).  Then I created my RAM drive.

mke2fs -m 0 /dev/ram0
mkdir -p /ramdrive
mount /dev/ram0 /ramdrive

The code I used to test was the following PHP code.

$physicalLog = '/home/kschroeder/file.log';
$ramLog = '/ramdrive/file.log';
$iterations = 1000000;
$message = 'The Quick Brown Fox Jumped Over The Lazy, oh who really cares, anyway?';

$time = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
  file_put_contents($physicalLog, $message, FILE_APPEND);
}

echo sprintf("Physical: %s\n", (microtime(true) - $time));

$time = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
  file_put_contents($ramLog, $message, FILE_APPEND);
}

echo sprintf("RAM: %s\n", (microtime(true) - $time));

unlink($physicalLog);
unlink($ramLog);

$time = microtime(true);
$fh = fopen($physicalLog, 'w');
for ($i = 0; $i < $iterations; $i++) {
  fwrite($fh, $message);
}
fclose($fh);

echo sprintf("Physical (open fh): %s\n", (microtime(true) - $time));

$time = microtime(true);
$fh = fopen($ramLog, 'w');
for ($i = 0; $i < $iterations; $i++) {
  fwrite($fh, $message);
}
fclose($fh);

echo sprintf("RAM (open fh): %s\n", (microtime(true) - $time));

This code tests each of the four scenarios.  file_put_contents to physical, file_put_contents to RAM, fwrite to physical and fwrite to RAM.  The test was run three times.  The test is a measure of the performance of a really crappy file system compared to RAM.

[Chart: seconds to write 1,000,000 log entries for each of the four scenarios]

The Y axis is the number of seconds that it took to write 1,000,000 log entries.  As we can see from the chart the RAM drive provided us some benefit with file_put_contents; it was about 12% faster.  But the interesting part is the last two scenarios.  The physical write with a held-open file handle is about 2.5 times faster than the file_put_contents write to the RAM drive.  And with the open file handle the RAM drive, again, outperformed the physical disks, this time by 42%.

But if you compare the two sets of tests you will notice something interesting.  While the performance difference in the latter test is 42%, the wall-clock difference is almost the same: 2.1 seconds for file_put_contents, 1.6 for fwrite.  So it would seem that the physical overhead of using the disks, per log file write, is around 0.00000055 seconds.

Now, I KNOW my disks are not that fast, nor is the VM that I ran the tests on.  It is probably due to write caching.  But that is largely irrelevant.  My assertion is that the file system (not necessarily just the disk) is not your enemy.

This is an important distinction.  I ran two dd commands, one to the disk and one to the RAM drive, both with fdatasync set.  The results are not surprising: 55MB per second for the disk, 290MB per second to the RAM drive.  But that is not the point.  The operating system, Linux in this case, does a LOT of things to make working with the physical layer as efficient as possible.  Therefore, things like logging or doing other operations on the file system are not necessarily a bad thing because the actual overhead involved is minimal compared with your application logic.

Please feel free to do similar tests and post results.  I would love to see data that contradicts me.  That would make this a much more interesting topic of conversation.  :-)

What SSL $_SERVER variables are available in PHP

I found myself wondering what HTTPS variables were available in the $_SERVER variable today and didn’t find a specific list (and didn’t have mod_ssl installed).  So as a public service, here is what my server says.

array(58) {
["HTTPS"]=>
string(2) "on"
["SSL_VERSION_INTERFACE"]=>
string(13) "mod_ssl/2.2.3"
["SSL_VERSION_LIBRARY"]=>
string(25) "OpenSSL/0.9.8e-fips-rhel5"
["SSL_PROTOCOL"]=>
string(5) "TLSv1"
["SSL_SECURE_RENEG"]=>
string(4) "true"
["SSL_COMPRESS_METHOD"]=>
string(4) "NULL"
["SSL_CIPHER"]=>
string(18) "DHE-RSA-AES256-SHA"
["SSL_CIPHER_EXPORT"]=>
string(5) "false"
["SSL_CIPHER_USEKEYSIZE"]=>
string(3) "256"
["SSL_CIPHER_ALGKEYSIZE"]=>
string(3) "256"
["SSL_CLIENT_VERIFY"]=>
string(4) "NONE"
["SSL_SERVER_M_VERSION"]=>
string(1) "3"
["SSL_SERVER_M_SERIAL"]=>
string(4) "6B5B"
["SSL_SERVER_V_START"]=>
string(24) "Aug 30 13:53:57 2013 GMT"
["SSL_SERVER_V_END"]=>
string(24) "Aug 30 13:53:57 2014 GMT"
["SSL_SERVER_S_DN"]=>
string(139) "/C=--/ST=SomeState/L=SomeCity/O=SomeOrganization/OU=SomeOrganizationalUnit/CN=localhost.localdomain/emailAddress=root@localhost.localdomain"
["SSL_SERVER_S_DN_C"]=>
string(2) "--"
["SSL_SERVER_S_DN_ST"]=>
string(9) "SomeState"
["SSL_SERVER_S_DN_L"]=>
string(8) "SomeCity"
["SSL_SERVER_S_DN_O"]=>
string(16) "SomeOrganization"
["SSL_SERVER_S_DN_OU"]=>
string(22) "SomeOrganizationalUnit"
["SSL_SERVER_S_DN_CN"]=>
string(21) "localhost.localdomain"
["SSL_SERVER_S_DN_Email"]=>
string(26) "root@localhost.localdomain"
["SSL_SERVER_I_DN"]=>
string(139) "/C=--/ST=SomeState/L=SomeCity/O=SomeOrganization/OU=SomeOrganizationalUnit/CN=localhost.localdomain/emailAddress=root@localhost.localdomain"
["SSL_SERVER_I_DN_C"]=>
string(2) "--"
["SSL_SERVER_I_DN_ST"]=>
string(9) "SomeState"
["SSL_SERVER_I_DN_L"]=>
string(8) "SomeCity"
["SSL_SERVER_I_DN_O"]=>
string(16) "SomeOrganization"
["SSL_SERVER_I_DN_OU"]=>
string(22) "SomeOrganizationalUnit"
["SSL_SERVER_I_DN_CN"]=>
string(21) "localhost.localdomain"
["SSL_SERVER_I_DN_Email"]=>
string(26) "root@localhost.localdomain"
["SSL_SERVER_A_KEY"]=>
string(13) "rsaEncryption"
["SSL_SERVER_A_SIG"]=>
string(21) "sha1WithRSAEncryption"
["SSL_SESSION_ID"]=>
string(64) "BE411F57BA97B3C7D61FC07B0DA965B99BF448081CA8C936C2BDE0C320712F3E"
["HTTP_TE"]=>
string(18) "deflate,gzip;q=0.3"
["HTTP_CONNECTION"]=>
string(9) "TE, close"
["HTTP_HOST"]=>
string(9) "localhost"
["HTTP_USER_AGENT"]=>
string(16) "lwp-request/2.07"
["PATH"]=>
string(29) "/sbin:/usr/sbin:/bin:/usr/bin"
["SERVER_SIGNATURE"]=>
string(70) "<address>Apache/2.2.3 (CentOS) Server at localhost Port 443</address>
"
["SERVER_SOFTWARE"]=>
string(21) "Apache/2.2.3 (CentOS)"
["SERVER_NAME"]=>
string(9) "localhost"
["SERVER_ADDR"]=>
string(9) "127.0.0.1"
["SERVER_PORT"]=>
string(3) "443"
["REMOTE_ADDR"]=>
string(9) "127.0.0.1"
["DOCUMENT_ROOT"]=>
string(13) "/var/www/html"
["SERVER_ADMIN"]=>
string(14) "root@localhost"
["SCRIPT_FILENAME"]=>
string(23) "/var/www/html/index.php"
["REMOTE_PORT"]=>
string(5) "41195"
["GATEWAY_INTERFACE"]=>
string(7) "CGI/1.1"
["SERVER_PROTOCOL"]=>
string(8) "HTTP/1.1"
["REQUEST_METHOD"]=>
string(3) "GET"
["QUERY_STRING"]=>
string(0) ""
["REQUEST_URI"]=>
string(1) "/"
["SCRIPT_NAME"]=>
string(10) "/index.php"
["PHP_SELF"]=>
string(10) "/index.php"
["REQUEST_TIME_FLOAT"]=>
float(1377871511.902)
["REQUEST_TIME"]=>
int(1377871511)
}

How much does logging affect performance?

So, I was having a discussion with a person I respect about logging and they noted that often logging poses a prohibitive cost from a performance perspective.  This seemed a little odd to me and so I decided to run a quick series of benchmarks on my own system.  Following is the code I used.

require_once 'Zend/Loader/Autoloader.php';
require_once 'Zend/Loader.php';
Zend_Loader_Autoloader::getInstance();

$levels = array(
  Zend_Log::EMERG  => 10000,
  Zend_Log::ALERT  => 10000,
  Zend_Log::CRIT   => 10000,
  Zend_Log::ERR    => 10000,
  Zend_Log::WARN   => 10000,
  Zend_Log::NOTICE => 10000,
  Zend_Log::INFO   => 10000,
  Zend_Log::DEBUG  => 10000
);

echo '<table>';

foreach (array_keys($levels) as $priority) {
  @unlink('/tmp/log');
  $format = '%timestamp% %priorityName% (%priority%): %message%' . PHP_EOL;
  $formatter = new Zend_Log_Formatter_Simple($format);
  $writer = new Zend_Log_Writer_Stream('/tmp/log');
  $writer->addFilter(new Zend_Log_Filter_Priority($priority));
  $writer->setFormatter($formatter);
  $logger = new Zend_Log($writer);

  $startTime = microtime(true);

  foreach ($levels as $level => $count) {
    for ($i = 0; $i < $count; $i++) {
      $logger->log(
        'Warning: include(Redis.php): failed to open stream: No such file or directory in /var/www/ee1.13/release/lib/Varien/Autoload.php on line 93',
        $level
      );
    }
  }

  $endTime = microtime(true);

  echo sprintf("<tr><td>%d</td><td>%f</td></tr>\n", $priority, ($endTime - $startTime));
}

echo '</table>';

What this code does is log a message 10,000 times at each of the eight logging levels, repeating the whole run once for each level of priority filtering.  So, basically, each run attempts 80,000 log entries with a different priority filter in place, to see the performance overhead.

[Chart: total elapsed time to log 80,000 events at each priority level]

You can see the total overhead for each level of logging.  This represents the total elapsed time to log 80,000 log events at the various levels of logging priority.

But nobody is logging 80,000 events (hopefully).  So what does this look like for a realistic approach?  Following is the breakdown based on the elapsed time for 100 log entries for an individual request.

[Chart: elapsed time per 100 log entries, by priority level]

 

So, logging seems to cost you a sum total of about one thousandth of a second per request (assuming 100 log entries).

So this begs the question…

 

[Meme image]

Google finally acknowledges that PHP exists

I read an article today about how PHP is exploding on Google App Engine.  How is it that one of the most despised programming languages in the world is running (as Google claims) up to 75% of the web?  Many nay-sayers will say “oh it’s just WordPress” or “oh, it’s just phpBB”.  But in doing that they are completely missing the point.

The proper response to this is not trying to dismiss it, but asking why it is that PHP-based applications just seem to always be the ones at the top of the list?  Some may answer that PHP’s ubiquity gives it a default advantage but that still dodges the question.  WHY is PHP running 75% of the web?  Hosters don’t just say “hey, let’s throw X programming language on our web servers!”

It comes down to demand.  There is a lot of demand for PHP.  That’s why hosters put it on their web servers.

In the article Venture Beat says “PHP is moving to the Enterprise very quickly”.  This is not true.  PHP IS in the enterprise and has been for a long time.  People just either don’t know it or refuse to admit it.

But, again, we have not answered the question “why”.

Many of the people who are nay-sayers of PHP are the people who have studied.  And in studying they have learned that programming languages need to do certain things in certain ways.  And PHP does none of those things (ok, so this is hyperbole, to a point).  This is a major reason why PHP has such a bad reputation among cutting edge developers, CS grads and trend-setters.

But what it also does is expose the vacuousness of the ivory tower.  The ivory tower deals with validating the theoretical, testing the impractical from within an educational framework or methodology.  People will often say that this approach is a more pure way of approaching the problem rather than the dirty commercially-driven interests of the private world.  To which I say “big frigging deal!”.  Don’t get me wrong, I think that study is good.  Though I didn’t go to university I am under a continuous education program called “reading”, for the theoretical, and “practice”, for the practical.  Study is good.  But study is not an end.  Real life occurs and it is not clean, pure and methodological.  What a bore if it were!

But this is real life.  PHP may not solve the problem in the purest of ways; in fact it will probably be pretty dirty.  But that is why it succeeds; it mirrors real life.  In real life you have a job to get done.  And if it takes more resources to do it properly, then the improper method will be used.  Commerce and business, at their most distilled, are simply an efficient means of the utilization and transfer of resources.  Those resources could be money, time, knowledge, or any combination of those or other things.  It is the utilization and transfer of things that have “value”.  And when you have two things that both have worth, purity and practicality, a judgment call needs to be made on which is more valuable.

PHP is valuable not because WordPress is built on it, but because PHP solved the problem WordPress was solving more easily.  In other words, it solved the problem by consuming fewer resources.

Using PHP, I think, is also one of the smarter moves by the company I work for, Magento.  For those who don’t know, Magento is the most popular ecommerce platform in the world and it is written in PHP.  Magento is probably the most complicated application platform available for PHP and it’s STILL easier to build for than most Java web applications, with a wider range of programming skills that can be utilized.  In other words, it enables commerce by utilizing fewer resources than competing solutions, but still provides stunning extensibility.

An organization should require as few “top-end” developers for a solution implementation as possible.  When it comes to Magento, WordPress, Joomla, etc. you do not require a CS degree to do it.  Rather than being a failure, that is a monumental success!  Scarcity increases cost and so if you can decrease scarcity (developer skill required) you can decrease cost.  And the real world is about doing as much as possible for as little as possible.

So how is it that Google missed PHP?  That is a question that I cannot answer since I don’t work for Google.  But I would surmise that it has something to do with the fact that Google didn’t WANT it there.  For all their posturing about being “data driven” they completely missed PHP despite the fact that they have access to the best web data on the planet.  Therefore I must presume that it’s another iteration of The Invisible Postman; also called “having blinders on”.  Node, Ruby, Python; all great languages and can do some really cool things that PHP cannot.  But they do not solve the problem of resource scarcity on the same level that PHP does, when it comes to web-based applications.

For software companies that are looking to break into the web there is only one language to start with.  As long as HTTP is the de facto protocol of the web, PHP will be its de facto programming language.  Suck up your pride, build your stuff, and be successful.

 

… and let the trolling commence.

Is prevention the best security practice?

I read a post tweeted by Chris Cornutt today.  The basic gist of the article is that your security is only as strong as your most ethically-challenged developer.  That got me thinking that we spend so much time trying to prevent intrusions when detection might be a better priority.  Some defenses, such as protecting against SQL injection, are useful because they guard not just against intruders but also against people who tend towards single-quote usage.  I would argue that preventing SQL injection is just as much about inadvertent data entry as it is about security.  Same thing with XSS.
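As a generic illustration of that dual purpose (my own example, not from the post), a parameterized query handles both the attacker and the customer named O’Brien with the same code:

// Connection details are placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=shop', 'user', 'secret');

// The single quote in O'Brien is treated as data, never as SQL,
// which also neutralizes deliberate injection attempts.
$stmt = $pdo->prepare('SELECT * FROM customers WHERE last_name = ?');
$stmt->execute(["O'Brien"]);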

But this also got me thinking about laws.  We tend to (wrongly) view laws as a preventative measure.  The problem is that there are always people who are willing to skirt the law, whatever that law may be.  Sometimes it’s because laws are unjust.  But who is to decide when the perceived unjust-ness of a law is sufficient to permit civil disobedience?  Or the rejection of that law by an individual?

But what if we (getting back to developers) worked under the presumption that our code would be attacked and its security defeated?  If we presume that our software is vulnerable, does it make more sense to lock it down as much as we can, or to implement methods to detect, or at least collect, information in a way that makes prosecution or recovery easy?  Just like you cannot write a law to prevent all people from wrongdoing, you cannot guarantee that your code is 100% secure.  Given that, would it work to take an approach that focused more on detection (and recovery) ahead of prevention?

Would our approach be different?

What would it look like?

Would it work?

Would it matter?

It may sound a little silly to ask but consider that banks do something like this when it comes to financial transactions.  Banks use eventual consistency to maintain financial records.  They are not ACID compliant.  It is possible to overdraw your account if you do it in a manner that beats out the eventually consistent implementation they use.  It is the only way to maintain the scale that they require.  The position of the banks is that IF a circumstance occurs where there is a discrepancy in bank records it costs them less to fix the issue than to prevent it in the first place.

Likewise, Amazon allows items to be sold when they aren’t sure about stock (just look at a recent purchase of mine).  Their presumption, presumably, is that it will cost them more to ensure completely accurate inventory management than to send an apology letter to a waiting customer.  Is there a correlation in software development when it comes to security?

I don’t have any answers ATM, and it may be that any implementation would end up being more costly than prevention (my current thought is that it would).  I’m just thinking out loud and wondering if anyone else has given thought to this.
