assertTrue( ): Amazon: The Everything Company


If you’re thinking about putting your company’s Web presence on Amazon’s computers (using EC2), you might want to ask yourself a few questions. Does Amazon already compete with your business in any way? How long before Amazon does compete with you? Do you want to put your online business in the hands of a potential competitor? Do you want Amazon to know more about your business than it already does? No one’s suggesting Amazon is actually going to spy on your business’s bits and bytes (which are already encrypted anyway, right?), but they can learn a lot simply by knowing your capacity needs, your business’s Web traffic patterns, your scale-out and failover strategies. Just by metering your Web usage, they know too much.

via assertTrue( ): Amazon: The Everything Company.


What SSL $_SERVER variables are available in PHP


I found myself wondering what HTTPS variables were available in the $_SERVER variable today and didn’t find a specific list (and didn’t have mod_ssl installed).  So as a public service, here is what my server says.

array(58) {
[“HTTPS”]=>
string(2) “on”
[“SSL_VERSION_INTERFACE”]=>
string(13) “mod_ssl/2.2.3”
[“SSL_VERSION_LIBRARY”]=>
string(25) “OpenSSL/0.9.8e-fips-rhel5”
[“SSL_PROTOCOL”]=>
string(5) “TLSv1”
[“SSL_SECURE_RENEG”]=>
string(4) “true”
[“SSL_COMPRESS_METHOD”]=>
string(4) “NULL”
[“SSL_CIPHER”]=>
string(18) “DHE-RSA-AES256-SHA”
[“SSL_CIPHER_EXPORT”]=>
string(5) “false”
[“SSL_CIPHER_USEKEYSIZE”]=>
string(3) “256”
[“SSL_CIPHER_ALGKEYSIZE”]=>
string(3) “256”
[“SSL_CLIENT_VERIFY”]=>
string(4) “NONE”
[“SSL_SERVER_M_VERSION”]=>
string(1) “3”
[“SSL_SERVER_M_SERIAL”]=>
string(4) “6B5B”
[“SSL_SERVER_V_START”]=>
string(24) “Aug 30 13:53:57 2013 GMT”
[“SSL_SERVER_V_END”]=>
string(24) “Aug 30 13:53:57 2014 GMT”
[“SSL_SERVER_S_DN”]=>
string(139) “/C=–/ST=SomeState/L=SomeCity/O=SomeOrganization/OU=SomeOrganizational[email protected]main”
[“SSL_SERVER_S_DN_C”]=>
string(2) “–”
[“SSL_SERVER_S_DN_ST”]=>
string(9) “SomeState”
[“SSL_SERVER_S_DN_L”]=>
string(8) “SomeCity”
[“SSL_SERVER_S_DN_O”]=>
string(16) “SomeOrganization”
[“SSL_SERVER_S_DN_OU”]=>
string(22) “SomeOrganizationalUnit”
[“SSL_SERVER_S_DN_CN”]=>
string(21) “localhost.localdomain”
[“SSL_SERVER_S_DN_Email”]=>
string(26) “[email protected]
[“SSL_SERVER_I_DN”]=>
string(139) “/C=–/ST=SomeState/L=SomeCity/O=SomeOrganization/OU=SomeOrganizational[email protected]main”

[“SSL_SERVER_I_DN_C”]=>
string(2) “–”
[“SSL_SERVER_I_DN_ST”]=>
string(9) “SomeState”
[“SSL_SERVER_I_DN_L”]=>
string(8) “SomeCity”
[“SSL_SERVER_I_DN_O”]=>
string(16) “SomeOrganization”
[“SSL_SERVER_I_DN_OU”]=>
string(22) “SomeOrganizationalUnit”
[“SSL_SERVER_I_DN_CN”]=>
string(21) “localhost.localdomain”
[“SSL_SERVER_I_DN_Email”]=>
string(26) “[email protected]
[“SSL_SERVER_A_KEY”]=>
string(13) “rsaEncryption”
[“SSL_SERVER_A_SIG”]=>
string(21) “sha1WithRSAEncryption”
[“SSL_SESSION_ID”]=>
string(64) “BE411F57BA97B3C7D61FC07B0DA965B99BF448081CA8C936C2BDE0C320712F3E”
[“HTTP_TE”]=>
string(18) “deflate,gzip;q=0.3”
[“HTTP_CONNECTION”]=>
string(9) “TE, close”
[“HTTP_HOST”]=>
string(9) “localhost”
[“HTTP_USER_AGENT”]=>
string(16) “lwp-request/2.07”
[“PATH”]=>
string(29) “/sbin:/usr/sbin:/bin:/usr/bin”
[“SERVER_SIGNATURE”]=>
string(70) “<address>Apache/2.2.3 (CentOS) Server at localhost Port 443</address>

[“SERVER_SOFTWARE”]=>
string(21) “Apache/2.2.3 (CentOS)”
[“SERVER_NAME”]=>
string(9) “localhost”
[“SERVER_ADDR”]=>
string(9) “127.0.0.1”
[“SERVER_PORT”]=>
string(3) “443”
[“REMOTE_ADDR”]=>
string(9) “127.0.0.1”
[“DOCUMENT_ROOT”]=>
string(13) “/var/www/html”
[“SERVER_ADMIN”]=>
string(14) “[email protected]
[“SCRIPT_FILENAME”]=>
string(23) “/var/www/html/index.php”
[“REMOTE_PORT”]=>
string(5) “41195”
[“GATEWAY_INTERFACE”]=>
string(7) “CGI/1.1”
[“SERVER_PROTOCOL”]=>
string(8) “HTTP/1.1”
[“REQUEST_METHOD”]=>
string(3) “GET”

[“QUERY_STRING”]=>
string(0) “”
[“REQUEST_URI”]=>
string(1) “/”
[“SCRIPT_NAME”]=>
string(10) “/index.php”
[“PHP_SELF”]=>
string(10) “/index.php”
[“REQUEST_TIME_FLOAT”]=>
float(1377871511.902)
[“REQUEST_TIME”]=>
int(1377871511)
}
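
If all you need is to branch on whether a request arrived over SSL, or to pull out just the SSL_* entries, a small helper over the keys in the dump above is enough.  This is only a sketch: the function names isHttpsRequest() and sslVariables() are mine, and the array is passed in (rather than read from $_SERVER directly) simply to make the functions easy to try out.

```php
<?php

/**
 * Returns true when the given server array indicates an HTTPS request.
 * mod_ssl sets HTTPS to "on"; under IIS the value can be "off" for plain
 * HTTP, so "off" is treated as not secure.
 */
function isHttpsRequest(array $server): bool
{
    return !empty($server['HTTPS']) && strtolower($server['HTTPS']) !== 'off';
}

/**
 * Returns only the entries whose key starts with "SSL_".
 */
function sslVariables(array $server): array
{
    return array_filter(
        $server,
        function ($key) {
            return strpos($key, 'SSL_') === 0;
        },
        ARRAY_FILTER_USE_KEY
    );
}
```

In a real request you would call these as isHttpsRequest($_SERVER) and sslVariables($_SERVER).  Keep in mind that which SSL_* keys show up depends on how mod_ssl is configured on your server, so treat the list above as one example and not a guarantee.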


Would this be a dumb idea for PHP core?


So, I have been playing around with an idea in my head for a while, a few years now.  It really came along as we started seeing more and more PHP applications rely on bootstrapping.  For me it was as I saw ZF applications becoming more and more complicated.  At the time I was consulting and I would see significant server resources consumed by bootstrapping the apps.  Loading config files, loading dependent classes, setting up dependencies, initializing ACLs, and the list goes on and on.

One of the ways to negate the effect would be to cache a bootstrap object and then pull that object from the cache at the start of the request.  However, the problem is that unserialization can actually end up taking more time than the bootstrap process itself.
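
To make the caching approach concrete, here is a minimal sketch of what I mean, assuming a plain file as the cache.  The function name loadBootstrap() and the cache file path are hypothetical; a real application would more likely use APC, memcached or similar.

```php
<?php

/**
 * Returns the bootstrapped object, running $init only when no usable
 * cached copy exists.  $cacheFile is a hypothetical path chosen by the caller.
 */
function loadBootstrap(string $cacheFile, callable $init)
{
    if (is_file($cacheFile)) {
        // Cache hit: rebuild the object graph from its serialized form.
        $cached = @unserialize(file_get_contents($cacheFile));
        if ($cached !== false) {
            return $cached;
        }
    }

    // Cache miss: run the expensive bootstrap once and store the result.
    $object = $init();
    file_put_contents($cacheFile, serialize($object), LOCK_EX);

    return $object;
}
```

Note that unserialize() still has to rebuild every object and needs each class definition to be loaded (or autoloadable) first, which is exactly where the time goes on a large object graph.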

So, I was wondering.  Perhaps there would be a way to provide a cacheable state of the Zend Engine.

Perhaps it would look something like this.

init_engine_state(Callback $init);

What this would do is call the callback and after the callback returns, but before init_engine_state() returns, the engine would take a snapshot of everything except the superglobals.  This would include classes, objects and opcodes.  The next time a request comes in the callback would not be executed, but the state of the engine would be set to the state that it was in during the previous run of the callback.

Internally, what would happen before init_engine_state() returns is that all of the pertinent hash tables would be copied to a different memory block for the initial request.  Then the next time a request comes in, memory for the copied hash tables would overwrite the existing ones.  As noted earlier this could also include opcodes for files, which would mean that the reams of autoloading function calls that typically happen could be completely bypassed.

I have seen legitimate cases where bootstrapping is taking 50% of the wallclock time.  Perhaps by providing an engine hook like this PHP performance could be dramatically improved.

 

… or maybe I’m just speaking out of my buttocks.

 

Please comment.  (On the idea, not my buttocks)

 

[UPDATE]

Here is an additional snippet of code that might help explain what I’m thinking of

<?php
 
$app = init_engine_state(function() {
    // This code would only be executed once
    require_once 'lib/code/Application.php';
    require_once 'lib/code/Autoloader.php';
 
    $app = new Application();
    $app->createAutoloader();
    $app->bootstrap();
    return $app;
    /* 
    * At this point a snapshot would be taken of
    * - opcodes
    * - class definitions
    * - objects
    * 
    */
});
// No autoloading would need to be done
$adapter = Application::getAdapter();
$app->dispatch();

The IBM i Programmer’s Guide to PHP… second edition?


Yep!  PHP is still making strides on the IBM i and people are loving it.  But with the world’s premier book on PHP for the i Series developer now several years old, it is time to update it.  So Jeff Olen and I have decided to start work on a second edition.  We will be making some changes but we will mostly be adding new content.  Lots of new content.  Here is a list of things that we’ve come up with.

  • Web services/Mobile
    • JSON
    • REST
    • Mobile interfaces and considerations
  • Language features in PHP 5.3, 5.4 and 5.5 (It will probably be a while before 5.5 is available on the i but we still want to cover it, giving you even MORE value for your money)
  • SOLID principles
  • Expand OO
    • Basics
    • Advanced
  • Standards
    • PSR*
    • Autoloading
  • Beginning Test Driven Development
  • New Toolkit
  • Security

I’ve actually started writing one of the chapters this morning, BUT!  If you are an IBM i developer and there are topics that you would like to have covered we would love to hear from you.  You can either comment below or email me at [email protected] (Yes, I’ll share anything with Jeff 🙂 ).


Setting max_input_time (with data!)


I asked a question on Twitter about why some of the recommended max_input_time settings seem to be ridiculously large.  Some of the defaults I’ve seen have been upwards of 60 seconds.  However, after thinking about it I was a little confused as to why a C program (i.e. PHP) would take so long to process string input.

I was thinking about this because I was looking at ways to protect PHP from denial of service attacks.  Having timeouts longer than necessary can exacerbate service availability problems.  And while I received some responses, those responses did not contain data.

So I decided to get some data.

I ran the test on a local quad core VM with about 1GB of memory.  So clearly I wasn’t going to be pushing a lot of data through.  But it would be enough to figure out what a typical PHP response would need.

I wrote a little test script using the ZF2 HTTP client which would simulate uploading a file and gathered elapsed time for sending the request.  I changed it to measure both read time and full request time.  Read time would only test from when the request had been written to the network to getting data back.  Since there was no data coming back that should only have a small impact on the HTTP processing time.

The script I used was this

use Zend\Uri\Http;
 
use Zend\Http\Client;
use Zend\Loader\StandardAutoloader;
require_once 'Zend/Loader/StandardAutoloader.php';
$loader = new StandardAutoloader();
$loader->registerNamespace('Zend', __DIR__ . '/Zend');
$loader->register();
 
class HttpClient extends Client
{
  public $elapsed;
 
  protected function doRequest(Http $uri, $method, $secure = false, $headers = array(), $body = '') {
 
    $this->adapter->connect($uri->getHost(), $uri->getPort(), $secure);
 
    if ($this->config['outputstream']) {
      if ($this->adapter instanceof Client\Adapter\StreamInterface) {
        $stream = $this->openTempStream();
        $this->adapter->setOutputStream($stream);
      } else {
        // Fully qualified: this script is not inside the Zend\Http namespace
        throw new \Zend\Http\Exception\RuntimeException('Adapter does not support streaming');
      }
    }
 
    // HTTP connection
    $startTime = microtime(true);
 
    $this->lastRawRequest = $this->adapter->write($method,
    $uri, $this->config['httpversion'], $headers, $body);
 
    $result = $this->adapter->read();
    $this->elapsed = microtime(true) - $startTime;
    return $result;
  }
}
 
for ($i = 0; $i < 200; $i += 20) {
//for ($i = 1; $i < 20; $i += 10) {
  $client = new HttpClient();
  $client->setUri('http://192.168.0.248/');
  $client->setMethod('POST');
  $client->setFileUpload('test.txt', 'somename', str_repeat('a', 1024 * 1024 * $i));
  $client->send();

  echo $i . 'MB took ' . $client->elapsed . "\n";
}

The read times for the different file sizes were

0MB took 0.6802020072937
20MB took 0.2431800365448
40MB took 0.015140056610107
60MB took 0.018751859664917
80MB took 0.02366304397583
100MB took 0.027199983596802
120MB took 0.18756008148193
140MB took 0.58918190002441
160MB took 0.62950801849365
180MB took 0.47761011123657

The full response times for each were

0MB took 0.047544956207275
20MB took 0.10768604278564
40MB took 0.18601298332214
60MB took 0.27659296989441
80MB took 1.966460943222
100MB took 0.4365668296814
120MB took 1.0387809276581
140MB took 0.75083804130554
160MB took 1.340390920639
180MB took 1.0809261798859

But most PHP requests are not file uploads, but URL-encoded form submissions.  So let’s see what happens when we change the data being sent to a form submission.

1MB took 0.048841953277588
11MB took 0.32986307144165
21MB took 0.59214305877686
31MB took 0.66419100761414
41MB took 0.72057294845581
51MB took 0.76613712310791
61MB took 0.82655096054077
71MB took 0.91010904312134
81MB took 0.95742678642273
91MB took 0.99846816062927
101MB took 0.89947819709778
111MB took 0.72254300117493
121MB took 1.5053050518036
131MB took 6.4079310894012
141MB took 8.9290759563446

I stopped the test there because the system started swapping.

*note* as you can tell from the times there was a lot of noise on the system, causing significant variations in response time.  You can expect a system under load to have similar variations.

So there are a couple of things we learned here.

  1. If your system does simple HTTP requests (no file uploads or crazy form sizes) 1 second should be sufficient, except if you are under significant load
  2. multipart/form-data processing seems to be MUCH more efficient than url-encoding from a memory usage standpoint (I was not expecting this)
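
Before tightening anything it is worth checking what your server is actually running with.  Here is a small sketch for that; the function name inputLimits() is mine, and the list of settings is just the ones I would look at alongside max_input_time.  Note that max_input_time cannot be changed with ini_set() at runtime, so the value has to be set in php.ini or in per-directory configuration.

```php
<?php

/**
 * Collects the request-input limits PHP is currently running with.
 * ini_get() returns the raw configured string, e.g. "60" or "8M".
 */
function inputLimits(): array
{
    $names = array(
        'max_input_time',
        'max_execution_time',
        'post_max_size',
        'upload_max_filesize',
    );

    $limits = array();
    foreach ($names as $name) {
        $limits[$name] = ini_get($name);
    }

    return $limits;
}

// Print one "name = value" line per setting.
foreach (inputLimits() as $name => $value) {
    echo $name . ' = ' . $value . "\n";
}
```

Run it through the web server and not from the command line, since the CLI uses its own configuration and the values will usually differ.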

*note* if you’re wondering why the second batch started at 1MB it’s because of this change in the testing code

$client->setParameterPost(
  array(
    'test1' => str_repeat('a', 1024 * 1024 * ($i / 2)),
    'test2' => str_repeat('b', 1024 * 1024 * ($i / 2))
  )
);

Clearly I could not start at zero.


Why is FastCGI w/ Nginx so much faster than Apache w/ mod_php?


I have a new post on using Jetty with PHP-FPM.  If you think this one is interesting, you should check that one out.

(this post has a sister post on Apache’s event MPM compared to Nginx)

I was originally going to write a blog post about why NginX with FastCGI was faster than Apache with mod_php.  I had heard a while ago that NginX running PHP via FastCGI was faster than Apache with mod_php and have heard people swear up and down that it was true.  I did a quick test on it a while back and found some corroborating evidence.  IIRC, it was for a Magento installation.

Today I wanted to examine it more in depth and see if I could get some good numbers on why this was the case.  The problem was that I couldn’t.

To test I did a simple “hello, world” script.  Why something simple?  Because once you’re in the PHP interpreter there should be no difference in performance.  So why not just do a blank page?  It’s because I wanted to have some kind of bi-directional communication.  The intent was to test the throughput of the web-server, not PHP.  So I wanted to be spending as little time in PHP as possible but still test the data transmission.

The baseline tests show the following.

Apache w/ mod_php

 Total transferred: 3470000 bytes
 HTML transferred: 120000 bytes
 Requests per second: 2395.73 [#/sec] (mean)
 Time per request: 4.174 [ms] (mean)
 Time per request: 0.417 [ms] (mean, across all concurrent requests)
 Transfer rate: 811.67 [Kbytes/sec] received

NginX with PHP-FPM

 Total transferred: 1590000 bytes
 HTML transferred: 120000 bytes
 Requests per second: 5166.39 [#/sec] (mean)
 Time per request: 1.936 [ms] (mean)
 Time per request: 0.194 [ms] (mean, across all concurrent requests)
 Transfer rate: 801.82 [Kbytes/sec] received

Apache was able to dish out 2400 requests per second compared with 5200 requests per second on NginX.  That was more than I had seen before and so  I did an strace -c -f on Apache to see what came up.  -c shows cumulative time on system calls, -f follows forks.  The result for the top 10?

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 33.65    0.042063           4     10003           getcwd
 16.10    0.020127           2     10001           writev
 16.00    0.019994           2     10001           shutdown
 10.54    0.013179           0     51836     40118 open
  9.01    0.011263           1     20008           semop
  5.22    0.006522           0     54507     10002 read
  2.53    0.003158           0     10024           write
  1.91    0.002386           0     88260     66483 close
  1.57    0.001959         245         8           clone
  1.16    0.001455           0     54832       384 stat64

getcwd?  Why?  Then I remembered that I had AllowOverride (.htaccess) turned on.  So I re-ran the test with AllowOverride set to None.
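
For reference, turning overrides off looks roughly like this in httpd.conf.  It is a sketch only: the directory path is an assumption (the usual CentOS document root), and whatever directives you had in .htaccess would need to be copied into the block by hand.

```apache
<Directory "/var/www/html">
    # Stop Apache from looking for .htaccess files on each request
    AllowOverride None

    # Directives that used to live in .htaccess go here instead
</Directory>
```

Apache needs a restart or reload to pick up the change.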

Total transferred: 3470000 bytes
HTML transferred: 120000 bytes
Requests per second: 5352.41 [#/sec] (mean)
Time per request: 1.868 [ms] (mean)
Time per request: 0.187 [ms] (mean, across all concurrent requests)
Transfer rate: 1813.40 [Kbytes/sec] received

At 5352 requests per second Apache actually was outperforming NginX.  But what about if more data was transferred?  So I created about 100k of content and tested again.

Apache

Total transferred: 1051720000 bytes
HTML transferred: 1048570000 bytes
Requests per second: 2470.24 [#/sec] (mean)
Time per request: 4.048 [ms] (mean)
Time per request: 0.405 [ms] (mean, across all concurrent requests)
Transfer rate: 253710.79 [Kbytes/sec] received

NginX

Total transferred: 1050040000 bytes
HTML transferred: 1048570000 bytes
Requests per second: 2111.08 [#/sec] (mean)
Time per request: 4.737 [ms] (mean)
Time per request: 0.474 [ms] (mean, across all concurrent requests)
Transfer rate: 216476.53 [Kbytes/sec] received

This time the difference was even greater.  This all makes sense.  mod_php has PHP embedded in Apache and so it should be faster.  If you’re running only PHP on a web server then Apache still seems to be your best bet for performance.  And if you are seeing a significant performance difference then you should check if AllowOverride is turned on.  If it is, try moving the .htaccess directives into httpd.conf, setting AllowOverride to None, and testing again.

If you are running mixed content, such as adding CSS, JS and images, then NginX will provide better overall performance but it will not run PHP any faster.  It will also respond better to denial of service attacks, but a CDN is generally better at mitigating that risk.

But if you are running pure PHP content on a given server, Apache seems to still be the best bet for the job.

[UPDATED]

Here’s a chart of the throughput difference

[Chart: apache-vs-nginx]


What do you hate about being a programmer?


I have my list.  What about you?

My list?

  1. I don’t like solving problems that aren’t related to what I’m doing
  2. I don’t like waiting for stuff
  3. I don’t like it when things don’t work
  4. I don’t like egotistical ass-wipes.  Egotists are OK, since I’m one.  But not the ass-wipes.  They just stink.
  5. I don’t like workarounds
  6. I don’t like failures that don’t give me a clue about what went on
  7. I don’t like APIs, frameworks or other things that are either too simple or too complex.  I like my tooling like I like my porridge.
  8. I would like services (DB, messaging, email, RPC) to be available on demand and with their state maintained if I don’t need them for a time.

Anything you disagree with?  Do you have a list of your own?


On Discovery


“When Alexander saw the breadth of his domain, he wept for there were no more worlds to conquer.” Hans Gruber

Perhaps I am more contemplative when I wake at 3:00 and can’t get back to sleep.  I was up for a good reason, feeding a beautiful little girl my wife and I are fostering (I was feeding from a bottle, in case anyone is confused) but those nights tend to make for strange thoughts during the day… more so than normal.

I was playing around on a virtual machine this morning on a project I’m working on and I was checking some of the log files.  I forgot exactly what wtmp had in it so I manned it and saw this:

#if __WORDSIZE == 64 && defined __WORDSIZE_COMPAT32
  int32_t ut_session;         /* Session ID, used for windowing */
  struct {
    int32_t tv_sec;           /* Seconds */
    int32_t tv_usec;          /* Microseconds */
  } ut_tv;                    /* Time entry was made */
#else
  long int ut_session;        /* Session ID, used for windowing */
  struct timeval ut_tv;       /* Time entry was made */
#endif

I have no idea why, but for some reason the “long int ut_session” reminded me of the excitement I felt when I first started playing with Linux.  I bought my first Linux book more for the CDs because downloading a distro was not feasible in those days.  The book was Using Linux – Third Edition by Jack Tackett Jr. and David Gunter, which I still have.  It included Red Hat, Slackware and Caldera.  To give you an idea of how old it was, there is no mention of DHCP and yum was just a twinkle in Seth Vidal’s eye.  But even though it was the early days of Linux it is still an 800-page book.

When I first got Linux installed on some crap computer I had I was completely enthralled at what I saw.  I don’t recall ever being that excited about DOS.  Maybe Windows 95 when it first came out, but that quickly waned.  I played with Linux for weeks.  Because I could look under the hood I became immersed in how an operating system works.  I never got a CS degree but I for damn sure learned a lot about computers.  It was like having my own personal Renaissance.

I bring up the quote from Hans Gruber because I’m trying to think of something that has spawned the same excitement about learning something new since then.  The closest I can come up with is my chapter on structured files in You want to do WHAT with PHP?  The experimentation I had to go through to get PHP to read an ext2 file system stoked the embers of an explorer’s mind.  It wasn’t new, but gaining the understanding necessary to access the raw file system was kind of exhilarating, though not quite at the same level.

If you take this to mean that I believe I know it all, you would be mistaken.  There is more that I don’t know than that I do, but that which I don’t know is less exciting than what I know was when I first learned it.  Though Alexander may have wept because there was no more a-conquerin’ to do, most of the world was still far from his grasp.

Think of it.  Cloud computing, what we used to call virtualization, was invented by IBM in the 1960s with the System/360.  Distributed computing had its genesis in the Jazz Age of computing with pocket protectors in place of Art Deco design.  We’re still using the same networking protocol that was made live in 1983.  Granted, things are faster and we have slicker UIs, but not many of the fundamentals have been radically altered.

And they probably shouldn’t be altered since very few of the actual problems have changed.  They’ve just changed in terms of scale, in terms of speed and in terms of service expectations.

So it makes me ask the question.  Is there anything that excites you as much as it did when you first got started in computers?  If so, what is it and why does it excite you?

 

This reminds me.  I need to watch Die Hard again.