Category Archives: Random

I’m curious about what you build

What do you spend your time on?  What types of applications do you work on?  I’m not so much curious about the frameworks you use, or your production environment.  What you do, not what you use, is what I’m interested in.  Do you build a blog, a CMS, an order entry system, a social media platform?  What is it that you work on?  If you are so inclined to share, please post a comment.

Implementing asynchronous functionality in Magento

ECommerce is a small thing, right?  Nobody’s doing it and it’s so simple that everyone who does it is doing it right.  When that Cyber Monday hits, nobody panics; sites stay up, they’re able to handle the load and nobody gets yelled at, right?

OK, maybe 20 years ago.

PHP eCommerce had humble beginnings.  Very humble beginnings.  And when those humble beginnings started to show, there was a company that seized upon that opportunity.  The result is an eCommerce platform called Magento.  If you are reading this, it is likely that you know about Magento.  Maybe you use it, maybe you don’t.  But you probably have an opinion.  Whether you like the software or not, that is the sign of a strong ecosystem.

I was asked to submit for the Magento Imagine conference in Los Angeles.  The topic of my talk was based off of a talk I did at ZendCon 2010 called “Do You Queue?”  That talk was about some general scalability considerations along with an example of a library that I wrote which allows you to utilize the Zend Server Job Queue to do easy asynchronous execution.  The talk I gave at Imagine was the same talk.

Except that it required a lot more code.  The reason is that I took the code that I wrote for ZendCon and created an abstraction layer that integrates directly with Magento without any changes to the core code.  What this means is that ANYONE who has any need for asynchronous execution (doing stuff outside of the inline code) can take that code and bake it into their own Magento installation.

That code is available on GitHub, the links for which I will provide in a moment.

There are three extensions (and a fourth library that they are based off of) that I wrote which can take you from simply implementing this asynchronous processing to actually doing Ajax-based payment processing.

On a very simplified level, the way the Zend Server Job Queue works is that you can tell the Job Queue to execute a URL asynchronously from the source request.  In other words if you have something that needs to execute some complex, or long running code, you can do so by simply calling a URL where that logic resides.

Which is cool, but I prefer more elegant constructions.  Maybe, just maybe, there’s a shortcut (and a gold star for you if you get the movie reference).

What I did was build a library that allows you to take this URL-based approach and change it to an object-based approach.  There are a couple of classes to be aware of, which are all part of the library which you can download from GitHub (https://github.com/kschroeder/ZendServer-JobQueue-Job-API).  They are based on PHP 5.3.

  • Manager 
    • Handles connecting to the queue and passing results back and forth
  • JobAbstract 
    • Abstract class that a job would be based off of
  • Response 
    • The response from the manager when a job is queued.  Contains the server name and job number

The only two classes you need to be concerned about are JobAbstract and Response.  Any job needs to extend the JobAbstract class.  This is sort of a “gateway” class that both takes the input and provides the result of the job, typically via a getter and a setter.  To see an example of this, download the Job API code and look at the class in the folder jobs/org/eschrade/job/GetRemoteLinks.php.
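To make the “gateway” idea a little more concrete, here is a minimal sketch of what such a job could look like.  The base class below is a stand-in written only for this example so that it runs without Zend Server; it is not the library’s JobAbstract, and the names SketchJobAbstract, executeInline(), run(), setHtml() and getLinks() are all assumptions.  GetRemoteLinks.php in the download is the real reference.

```php
<?php
// Stand-in for the library's JobAbstract, only so this sketch runs on its own.
// The real class hands the job to the queue; its method names may differ.
abstract class SketchJobAbstract
{
    // Runs the job body inline instead of queueing it.
    public function executeInline()
    {
        $this->run();
        return $this;
    }

    abstract protected function run();
}

// Hypothetical job: input goes in through a setter, the result comes out of a getter.
class SketchExtractLinks extends SketchJobAbstract
{
    private $html = '';
    private $links = array();

    public function setHtml($html)
    {
        $this->html = $html;
    }

    public function getLinks()
    {
        return $this->links;
    }

    protected function run()
    {
        // Collect the href attribute of every anchor tag in the input.
        preg_match_all('/<a\s[^>]*href="([^"]*)"/i', $this->html, $matches);
        $this->links = $matches[1];
    }
}

$job = new SketchExtractLinks();
$job->setHtml('<a href="http://www.eschrade.com/">home</a>');
$job->executeInline();
print_r($job->getLinks());
```

The point of the shape is that the caller only ever touches the setter before the job runs and the getter after it has completed, which is what lets the same object travel to the queue and back.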

To execute a job, simply instantiate the job, provide whatever data it needs, and call the execute method.

use org\eschrade\job\GetRemoteLinks;
$job = new GetRemoteLinks(
      'http://www.eschrade.com/'
);
 
$response = $job->execute();

 

That method returns a response which provides the server name and job ID. It is used later on when you check to see if the job has completed.  That is done with some very simple code as well.

use com\zend\jobqueue\Manager;
 
$mgr = new Manager();
if (($job = $mgr->getCompletedJob($response)) !== null) {
     // do stuff
}

 

If the manager returns a null value it means that the job has not finished executing.  If it has completed then it will return the instance of the original object back to you so you can use it to retrieve the results.

I don’t want to spend too much time on the details here.  You can see this working by downloading Zend Server, getting a 30 day trial license, and downloading the library code.  It should be a very quick install.  On Linux machines it should work out of the box, though Windows machines may require you to set a named queue (zend_jobqueue.named_queues) which matches your hostname to the value of zend_jobqueue.default_binding.  In my case, the value is LAP-KEVIN:tcp://127.0.0.1:10085 for zend_jobqueue.named_queues.

That whole introduction is to bring you to a place where you can get a minimal view of how the Magento extension I built works.  I would recommend understanding how the base code works before diving into the Magento portion.

There are two primary Magento extensions that I built that utilize this.  The first is the abstraction layer that implements my prior job API.  The second is an example, called Async_Payment, which intercepts payment requests and does them asynchronously.

The Job Queue layer is an extension called Zendserver_Jobqueue and is available on GitHub (https://github.com/kschroeder/Magento-ZendServer-JobQueue).  Once installed it will require you to provide an entry point URL for the location where the jobs will actually execute.  This can either be the local machine, a remote machine or a load balancer.  It is set in the regular configuration GUI in Magento.  My value is http://mage.local/jobqueue/, since it uses the regular router.  If you have custom routing you may need to change that.  The URL needs to call Zendserver_Jobqueue_IndexController::indexAction() which is where the Job Queue manager is invoked.

If you look in the controller code you will also see a quick example that shows how this works.  There is a sample job provided called Zendserver_Jobqueue_Job_Nots.  What it does is take a Boolean value and negate it, providing the result for later.  The job extends Zendserver_Jobqueue_JobAbstract which, in turn, extends com\zend\jobqueue\JobAbstract.

As a side note, the version of Magento that I had when writing this, which was off of the 1.5 development branch, did not support PHP 5.3 namespaces, so I needed to build a mechanism that included the Zend Framework autoloader, which does.  My understanding is that this is an issue that will be fixed shortly.

The next extension is the one called Async_Payment.  What it does is use an observer to redirect payment requests to the controller in Async_Payment.  The way this is done is via configuration under the ASYNCHRONOUS PAYMENT category.  That shows the different payment methods, but adds another tab called Asynchronous Settings.  The setting here allows you to turn the asynchronous processing on and off (handled in the observer).  What you need to do to make it work is give it the view templates to watch for.  When it sees that one of those templates (comma separated) is being rendered it appends some JavaScript that overwrites some of the functionality of the one-click checkout method to redirect the payment request to the Async_Payment controller.  Still following me?  My value is checkout/onepage.phtml.  So when that view is being rendered the extension will know that it needs to inject some JavaScript into the view to take hold of the payment request.

The final payment request is redirected to the Async_Payment_IndexController class.  What it does is take the data being submitted, which is exactly the same as the normal payment request, and pass it into a job, which is then executed in Async_Payment_IndexController::taskexecAction().  Then the browser will call Async_Payment_IndexController::oneclickpingAction() to check the queue manager to see if the asynchronous payment has been completed.

The asynchronous payment job is actually quite simple.  It pretends to be a browser, does an HTTP request to the original payment URL, and returns the result.  Then the next time the browser calls oneclickpingAction() the raw result is returned to the browser, which interprets it as it would the response to a normal request, and you’re on your way.

Where to go from here?  First, download Zend Server.  There’s a 30 day trial license that you can use to try this stuff out.  Second, download the simplified Job Queue library.  Run the unit tests and debug the code.  That’s the best way for you to understand what’s actually going on.  After that, download and install the Magento extensions.  I think it is critical to work in this order, especially if you’re a coder.  Jumping straight into the Magento extensions will probably end up confusing you if you don’t properly understand the basic job queuing mechanism first.

Have fun, and drop me a line on email (kevin @ zend) or on Twitter.

Objections to dynamic typing

I am about to head out to Magento Imagine to speak on queuing and scalability.  So what is today’s blog post about?  Dynamic typing, which has absolutely nothing to do with scalability.

Every once in a while I inject my opinions into places where they are not welcome.  I have heard from people in the statically typed realm about how amateur dynamic typing is.  Some people are interested in understanding how to use dynamic typing, others, not so much.  So what I would like to do is talk about some of the arguments made against dynamic typing.  Clearly PHP will be my reference point, but many of my points will be salient across many dynamically typed languages.

The biggest misconception about PHP is that it is a strictly dynamically typed language; in other words, that it cannot have typed variables.  Where you are using the OOP mechanisms in PHP, you have the opportunity to strictly type your method parameters.

class Test {}
class Test2 {}
 
class ExecuteTest {
    public function exec(Test $test)
    {
        doSomethingWithTest($test);
    }
}
 
$et = new ExecuteTest();
$et->exec(new Test2());

What happens when this code is run?

Catchable fatal error: Argument 1 passed to ExecuteTest::exec() must be an instance of Test, instance of Test2 given, called in test.php on line 17 and defined in test.php on line 9

Fatal error.  This is because the type of object passed in was incorrect.  So data types do exist in PHP and many other languages.  The only downside is that you need to actually run the code on your web server or in a unit test to trigger the check.  Some would argue (and have argued extensively) that this is a significant drawback.  There’s truth to that, but in a very limited scope.  Is it a drawback?  Yes.  Is it significant?  Not by a long shot.  Whether it’s PHP, Java, C, Perl, Ruby, VB, C#, JavaScript, etc., etc., if you deploy code that you haven’t tested then you deserve every error and every sleepless night you get.  It’s called being responsible for your code.  And don’t think that having your code pre-compiled is much better.  I have a lot of compiled applications running on my computer.  Cakewalk SONAR, Firefox, Apache, PHP (the binaries), MySQL, Tweetdeck, Java, etc., etc.  And you know what?  Shit still happens with compiled code!  Sometimes even type-related errors!  Compiling your code ahead of time as you do with C, Java, and the like does not fully protect you from type-based errors.  Can you catch some fat-fingered errors?  Sure.  Are you safe?  No.

For example, take this Java code

 

System.out.print(
     Integer.MAX_VALUE
);

 

Running it provides an output of

2147483647

What about this code?

 

System.out.print(
     Integer.MAX_VALUE + 1
);

 

The output is 2147483648, correct?  No.  It is

-2147483648

Did the compiler flag that?  No.  Is that a weakness in Java?  No.
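The same experiment in PHP goes differently but makes the same point.  PHP does not wrap around on integer overflow; it converts the result to a float, so the type changes quietly and nothing flags that either.  A small sketch (variable names are just for illustration):

```php
<?php
$max  = PHP_INT_MAX;
$next = $max + 1;

var_dump(is_int($max));    // bool(true)
var_dump(is_float($next)); // bool(true): the overflow produced a float, not an integer
```

Either way, it is the value you have to watch, not the declared type.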

That was a bit of a side-track, but it highlights what I think is a misconception about PHP applications and structure, usually made by those who don’t do significant work in PHP.  Do you HAVE to build a well-structured application?  No.  Should you?  Of course.  PHP does not force you to write bad code.

Often those who complain about PHP will complain not about what PHP developers DO, but about what they COULD do.

But let me address what I think is probably the biggest complaint from the static-typed crowd.  The biggest problem is not the programming language, but the user sitting at the interface.  The problem is that if you have data entered for a field called “width” and the person on the other end enters “wiener” the problem isn’t that “wiener” evaluates to zero, it’s that you didn’t validate your data.  This problem exists for C, Java, or whatever your compiled language is.  Why?  Because if you are building a web site you do not have the ability to force typed input.  Why? HTTP is ALL strings.  There are no integers, there are no floats, there are no booleans.  They are all strings, regardless of the language you are working in.  And there are plenty of validators in the core PHP language to use (ctype_*) and plenty of frameworks that offer more advanced validators.  If you aren’t validating input in ANY language you will have problems.
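As a rough illustration of the “width” case, here is a minimal sketch using ctype_digit() from the PHP core.  The helper name parseWidth() and the range limits are made up for this example, and the request array stands in for whatever $_GET or $_POST would hold.

```php
<?php
// Hypothetical helper: accepts only a non-empty string of digits within a range.
function parseWidth($raw, $min = 1, $max = 10000)
{
    if (!is_string($raw) || !ctype_digit($raw)) {
        return null; // "wiener", "-5", "12.5" and "" all end up here
    }
    $width = (int) $raw;
    return ($width >= $min && $width <= $max) ? $width : null;
}

// Simulated request data; in a real page this would come from $_GET or $_POST.
$request = array('width' => 'wiener');
$width = parseWidth($request['width']);
echo $width === null ? "Invalid width\n" : "Width is $width\n";
```

Nothing here depends on the language being dynamically typed.  The equivalent check has to exist in a Java servlet too, because the value arrives as a string there as well.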

The next issue is with API code.  I have no problem conceding that if I am writing an extremely math-based application, PHP is likely not a good choice.  So, 3d applications, music/audio applications, games, etc.  Going back to the previous paragraph, PHP was written for HTTP.  HTTP uses strings.  But if you’re building an API or structuring an application, how do you make sure that other programmers don’t provide data that royally messes up your code?  Conceptually, I will concede that this could be a problem.

But go back a few paragraphs.  I made the claim that people complain not about what PHP developers do, but what they could do.  The example I saw was

 

$square->setSize("hello");

 

setSize(int), in a static-typed language, will fail.  Could you put “hello” into a method call named setSize() in PHP?  Yes.  Next question; would you?  If you are worried that someone will put the word “hello” into a method called setSize() then you should be wondering about the basic cognitive abilities of the programmer, not the abilities of the language.  We’re not even talking about using something like Hungarian notation or anything like that.  We’re talking about simple logic.  Does “hello” make sense in that context?

OK, so that example was on the simplistic side.  What about when you are dealing with complex third party applications?  In all honesty, I don’t know how to quantify that.  If there’s a method called setPrice() do I put a description in there?  I suppose that if you are using a lot of magic in your application there could be a benefit, but the whole complaint is that static typing is safer than dynamic typing.  In that case

 

System.out.print(
    Integer.parseInt("Dubious data")
);

 

should be a nonexistent issue.  But it’s not, because invalid data was passed.  These kinds of games can go on and on but they miss the point, because these are things that the language cannot check against.  If you intend to leave your application open with no validation either at the front of your application or in your API then you will have problems.  It doesn’t matter if it’s typed or dynamic.  What matters more than the type is the value, as we saw in the Integer.MAX_VALUE + 1 example.  So even when writing, or writing to, a third party API you still need to validate data.  As such, the gain of typing is not as significant as is sometimes claimed.
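For the setSize() case, validating in the API itself could look something like the sketch below.  SketchSquare is a made-up class for this example; the point is only that the check covers the value as well as the type.

```php
<?php
class SketchSquare
{
    private $size = 0;

    public function setSize($size)
    {
        // Check the type and the value; a type declaration alone would still accept -5.
        if (!is_int($size) || $size <= 0) {
            throw new InvalidArgumentException('Size must be a positive integer');
        }
        $this->size = $size;
    }

    public function getSize()
    {
        return $this->size;
    }
}

$square = new SketchSquare();
try {
    $square->setSize('hello');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";
}
```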

I’ve talked so far about objections to dynamic typing.  What about the pros?  This is a short argument because there is one big reason why dynamic typing is good, and it is quick to explain.

I think the biggest pro of working with a dynamically typed language is that you go from working with data/types to working simply with data.  In other words, do you really care about whether or not 1 is a string “1” or an integer 1?  No.  You care that it’s a 1 up until you are doing math.  Then at that point the interpreter works on that data as if it were a number.  When you print that number in a browser do you care if it’s a numerical 1 or a string “1”?  No.  You care that the user sees the number 1 on their screen.  In that case it’s a string.  Data; that is what applications are about.  Data.  Applications are not about data types.  Data typing for scalar data can help to mitigate some issues but it is by no means fool-proof.  In fact, a lot of security issues are due to some very smart “foolish” people messing with input.  Data types do not protect you against anything.  In Java, you get bad data, in C you get a buffer overflow.
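In code, that round trip from string to number and back to string looks like this (a tiny sketch; the variable names are mine):

```php
<?php
$fromForm = '1';               // everything arriving over HTTP is a string
$sum = $fromForm + 1;          // a numeric string is treated as a number for math
var_dump($sum);                // int(2)
echo 'Count: ' . $sum . "\n";  // and it becomes a string again for output
```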

Data.  That is what we are working with.  My purpose in writing this is not to say that dynamic typing is better than static typing, or anything like that.  Types are ways of describing data, but they do not protect you against anything.  Statically typed languages will protect you in some cases, but leave you wide open in others.

So why is dynamic typing sometimes listed as an objection?  I believe that’s exactly what it is; an objection.  Objections are simply reasons that people find to not use your stuff and are usually raised to put you on the defensive.  It’s standard fare whether you’re selling kitty litter or a programming language.  A lot of people hate PHP, for reasons I have described before.  My interpretation is that dynamic typing is raised as an issue mostly because people see attacks on their preferred language as an attack on themselves.  Honestly, I can think of no other reason why people get so engaged in tech-wars the way they do.

ZendCon 2010 Podcast – Pragmatic Guide to Git

Speaker

Travis Swicegood

Abstract

Git is hard; at least if you listen to the naysayers. Actually, you need to know a handful of commands to navigate Git successfully. This talk demystifies Git. Once we're finished you'll know everything you need to start using Git in your day-to-day projects and collaboratively with other developers.

Licensing:

The ZendCon Sessions are distributed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.  Please honor this license and the rights of our authors.


Podcast Download

Download as MP3


ZendCon 2010 Podcast – Unit Testing in Zend Framework 1.8

Speaker

Michelangelo van Dam

Abstract

Zend Framework 1.8 has improved and simplified how you can test your applications, providing you with excellent techniques to streamline your quality assurance processes and reduce your maintenance costs.

Licensing:

The ZendCon Sessions are distributed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.  Please honor this license and the rights of our authors.


Slides

View Slides (SlideShare)

Podcast Download

Download as MP3


ZendCon 2010 Podcast – Introducing Zend Framework 2.0

Speaker

Ralph Schindler (Penn) and Matthew Weier O'Phinney (Teller)

Abstract

Zend Framework has grown tremendously since the first public preview release in March 2006. Originally a slim MVC framework with a number of standalone components, it has grown to a codebase of more than 2M lines of code. Work now turns to version 2, with goals of increased simplicity and advanced PHP 5.3 usage.

Licensing:

The ZendCon Sessions are distributed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.  Please honor this license and the rights of our authors.


Slides

View Slides (SlideShare)

Podcast Download

Download as MP3


ZendCon 2010 Podcast – Do You Queue?

Speaker

Kevin Schroeder

Abstract

There has been a lot of talk over the past several years about the difference between performance and scalability. When talking about building a scalable application, queuing is a concept that many PHP developers are not overly familiar with. In this talk we will demonstrate how you can use the Zend Server Job Queue to scale your application.

Licensing:

The ZendCon Sessions are distributed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.  Please honor this license and the rights of our authors.


Slides

View Slides (SlideShare)

Podcast Download

Download as MP3


Subnet validation with Zend Framework

(Note – I accidentally gave conflicting instructions to the person who runs our newsletter.  If you are actually interested in the article I wrote about people being silly about dynamically typed languages you can go here)

In a StackOverflow posting, someone asked how you could use a Zend Framework validator to tell if an IP address was between two addresses.  The individual was trying to use Zend_Validate_Between to do the checking.  However, IP addresses generally are not checked between two arbitrary addresses such as between 192.168.0.45 and 192.168.0.60.  Instead, the check is usually done to validate an IP address against a subnet.

So, assuming that the individual was actually asking about subnet validation, and seeing that I couldn’t find a subnet validator for Zend Framework, I wrote a quick one.

<?php
class Esc_Validate_Subnet extends Zend_Validate_Abstract
{
    const NOT_ON_SUBNET = 'not-on-subnet';
    private $_subnet;
    private $_netmask;
    protected $_messageTemplates = array(
        self::NOT_ON_SUBNET => "'%value%' is not on the subnet"
    );
 
    public function __construct($subnet, $netmask = null)
    {
        if ($netmask === null && strpos($subnet, '/') === false) {
            throw new Zend_Validate_Exception(
                 'If the netmask is not specified then the CIDR (e.g. /24) must be provided');
        }
        if ($netmask === null) {
            $this->_subnet    = ip2long(substr($subnet, 0, strpos($subnet, '/')));
            $cidr = substr($subnet, strpos($subnet, '/') + 1);
            // /0 is used to denote the default route.  Since we're not doing routing,
            // we will use it for error checking
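            // ctype_digit() rejects non-numeric and negative values such as "abc" or "-1"
            if (!ctype_digit($cidr) || $cidr < 1 || $cidr > 32) {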
                throw new Zend_Validate_Exception('Invalid CIDR specified');
            }
            $this->_netmask = -1 << (32 - (int)$cidr);
 
        } else {
            $this->_subnet    = ip2long($subnet);
            $this->_netmask = ip2long($netmask);
        }
 
    }
 
    public function isValid ($value)
    {
        $this->_setValue($value);
        $host = ip2long($value);
 
        $check1 = $host & $this->_netmask;
        $check2 = $this->_subnet & $this->_netmask;
 
        if ($check1 == $check2) {
            return true;
        }
        $this->_error(self::NOT_ON_SUBNET);
        return false;
    }
}
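
If you want to try the underlying arithmetic without Zend Framework, the check boils down to masking both addresses and comparing the results.  The function below is a standalone sketch, not part of the validator; the name sketchInSubnet() is made up, and it assumes a 64-bit PHP build where ip2long() returns a non-negative integer.

```php
<?php
// Hypothetical standalone helper; assumes a 64-bit PHP build.
function sketchInSubnet($ip, $cidr)
{
    $parts = explode('/', $cidr, 2);
    if (count($parts) !== 2 || !ctype_digit($parts[1])) {
        return false;
    }
    $bits    = (int) $parts[1];
    $ipLong  = ip2long($ip);
    $netLong = ip2long($parts[0]);
    if ($ipLong === false || $netLong === false || $bits < 1 || $bits > 32) {
        return false;
    }
    // Mask with the top $bits bits set: /24 becomes 255.255.255.0.
    $mask = (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF;
    return ($ipLong & $mask) === ($netLong & $mask);
}

var_dump(sketchInSubnet('192.168.0.45', '192.168.0.0/24')); // bool(true)
var_dump(sketchInSubnet('192.168.1.45', '192.168.0.0/24')); // bool(false)
```

Unlike the validator, this sketch returns false for a malformed subnet instead of throwing, which keeps the example short.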

 

Here is the Unit Test case

 

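<?php
// The class name and the first test's name are assumed here; they follow the
// naming pattern of the remaining tests.
class Esc_Validate_SubnetTest extends PHPUnit_Framework_TestCase
{
    public function testValidSubnetAndNetmask()
    {
        $sn = new Esc_Validate_Subnet('192.168.0.1', '255.255.255.0');
        $this->assertTrue($sn->isValid('192.168.0.2'));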
    }
 
    public function testValidSubnetWithCIDR()
    {
        $sn = new Esc_Validate_Subnet('192.168.0.1/24');
        $this->assertTrue($sn->isValid('192.168.0.2'));
    }
 
    public function testInvalidSubnetAndNetmask()
    {
        $sn = new Esc_Validate_Subnet('192.168.1.1', '255.255.255.0');
        $this->assertFalse($sn->isValid('192.168.0.2'));
    }
 
    public function testInvalidSubnetWithCIDR()
    {
        $sn = new Esc_Validate_Subnet('192.168.1.1/24');
        $this->assertFalse($sn->isValid('192.168.0.2'));
    }
 
    /**
     * @expectedException Zend_Validate_Exception
     */
 
    public function testMalformedCIDRBadSeperator()
    {
        $sn = new Esc_Validate_Subnet('192.168.0.1_24');
        $this->assertTrue($sn->isValid('192.168.0.2'));
    }
 
    /**
     * @expectedException Zend_Validate_Exception
     */
 
    public function testMalformedCIDRNumericalCIDR()
    {
        $sn = new Esc_Validate_Subnet('192.168.0.1/abc');
        $this->assertTrue($sn->isValid('192.168.0.2'));
    }
 
    /**
     * @expectedException Zend_Validate_Exception
     */
 
    public function testMalformedCIDRLowCIDR()
    {
        $sn = new Esc_Validate_Subnet('192.168.0.1/-1');
        $this->assertTrue($sn->isValid('192.168.0.2'));
    }
    /**
     * @expectedException Zend_Validate_Exception
     */
 
    public function testMalformedCIDRHighCIDR()
    {
        $sn = new Esc_Validate_Subnet('192.168.0.1/33');
        $this->assertTrue($sn->isValid('192.168.0.2'));
    }
 
}

 

Feel free to peruse and see if I have made any mistakes.  I wrote it pretty quickly and so there is, in all likelihood, something that could be tweaked.

ZendCon 2010 Podcast – A New Approach To Object Persistence In PHP

Speaker

Stefan Priebsch

Abstract

The object-relational impedance mismatch makes persisting PHP objects in a relational database a daunting task. How about these new schemaless NoSQL databases? We will have a look at the problems involved with persisting PHP objects, and introduce design patterns that help solving these problems. Putting the patterns to good use, we will build a working PHP object persistence solution for MongoDB.

Licensing:

The ZendCon Sessions are distributed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.  Please honor this license and the rights of our authors.


Slides

View Slides (SlideShare)

Podcast Download

Download as MP3

Pre-caching FTW

I just had an epiphany.  I’ve talked before about pre-caching content and the benefits thereof.  But this is the first time I realized not only that there are benefits, but that doing it is BETTER than caching inline.  Let me sum up… no, there is too much.  Let me explain.

Typically caching is done like this (stolen from the ZF caching docs):

$id = 'myBigLoop'; // cache id of "what we want to cache"
 
if ( ($data = $cache->load($id)) === false ) {
    // cache miss
 
    $data = '';
    for ($i = 0; $i < 10000; $i++) {
        $data = $data . $i;
    }
 
    $cache->save($data);
 
}
 
echo $data;

Pretty easy.  But what happens if you have code like this:

$options = $app->getOption('google');
$client = Zend_Gdata_ClientLogin::getHttpClient(
       $options['username'],
       $options['password'],
       Zend_Gdata_Analytics::AUTH_SERVICE_NAME
);

What’s so important about this code?  Is it because it is of a remote nature?  Is it because it uses GData?  Nope.  It’s because it has a username and a password.  Given the previous inline caching, what happens if that password changes (like mine did)?  The next cache miss triggers a login that fails, and your site is down.

So, why do I now think that pre-caching is better than inline caching?  Look at my front page.  You would never know that I’m currently having a problem because it’s still reading from the same cache key (with non-expiring data).

THAT is why I’m forming the opinion that pre-caching/asynchronous caching not only has benefits over inline caching, but that it may actually be better.  I’m not one to make blanket statements, and I’m not going to.  But I am toying with the idea of using pre-caching as the default mechanism for caching instead of the other way around.
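
Here is a rough sketch of the difference.  The cache class is an in-memory stand-in written for this example (its save($data, $id) argument order mirrors Zend_Cache, but it is not Zend_Cache), and refreshCache() and renderFrontPage() are hypothetical names.  The idea is that only the background job talks to the remote service, and it only overwrites the cache when the fetch succeeds.

```php
<?php
// Minimal in-memory stand-in for a cache backend (hypothetical, for illustration).
class SketchCache
{
    private $store = array();

    public function load($id)
    {
        return array_key_exists($id, $this->store) ? $this->store[$id] : false;
    }

    public function save($data, $id)
    {
        $this->store[$id] = $data;
    }
}

// Background job: fetches fresh data and only overwrites the cache on success.
function refreshCache(SketchCache $cache, $id, $fetch)
{
    try {
        $cache->save(call_user_func($fetch), $id);
        return true;
    } catch (Exception $e) {
        // Fetch failed (e.g. bad credentials): keep serving the old entry.
        return false;
    }
}

// Page request: reads the cache only, never calls the remote service.
function renderFrontPage(SketchCache $cache, $id)
{
    $data = $cache->load($id);
    return $data === false ? 'No data yet' : $data;
}

$cache = new SketchCache();
refreshCache($cache, 'analytics', function () { return 'stats v1'; });
refreshCache($cache, 'analytics', function () { throw new Exception('Login failed'); });
echo renderFrontPage($cache, 'analytics'), "\n"; // stats v1
```

With inline caching the failed login would have happened inside the page request; here the page keeps showing the last good data while the refresh job fails quietly in the background.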
