Shared posts

03 Jul 04:54

Understanding Streams in PHP

by Vito Tardia

Streams are resources provided by PHP that we often use transparently, but which can also be very powerful tools. By learning how to harness their power, we can take our applications to a higher level.

The PHP manual has a great description of streams:

Streams were introduced with PHP 4.3.0 as a way of generalizing file, network, data compression, and other operations which share a common set of functions and uses. In its simplest definition, a stream is a resource object which exhibits streamable behavior. That is, it can be read from or written to in a linear fashion, and may be able to fseek() to an arbitrary location within the stream.

Every stream has an implementation wrapper which contains the additional code necessary to handle the specific protocol or encoding. PHP provides some built-in wrappers and we can easily create and register custom ones. We can even modify or enhance the behavior of wrappers using contexts and filters.

Stream Basics

A stream is referenced as <scheme>://<target>. <scheme> is the name of the wrapper, and <target> will vary depending on the wrapper’s syntax.

The default wrapper is file:// which means we use a stream every time we access the filesystem. We can either write readfile('/path/to/somefile.txt') for example or readfile('file:///path/to/somefile.txt') and obtain the same result. If we instead use readfile('http://google.com/') then we’re telling PHP to use the HTTP stream wrapper.
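The equivalence between a bare path and an explicit file:// URL can be checked without touching the network. The following is a minimal sketch, assuming a writable system temp directory; the helper name readBothWays() and the scratch file prefix are illustrative and not part of the original article.

```php
<?php
// Hypothetical helper: reads one file twice, once through a bare path
// and once through an explicit file:// URL, and returns both results.
function readBothWays(string $contents): array
{
    // tempnam() returns an absolute path, which file:// requires
    $path = tempnam(sys_get_temp_dir(), 'stream_demo_');
    file_put_contents($path, $contents);

    $plain    = file_get_contents($path);              // implicit file wrapper
    $explicit = file_get_contents('file://' . $path);  // explicit file wrapper

    unlink($path);
    return [$plain, $explicit];
}

list($plain, $explicit) = readBothWays("Hello streams\n");
var_dump($plain === $explicit);
```

Both calls go through the same wrapper, so the two strings should be identical.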

As I said before, PHP provides some built-in transports, wrappers, and filters. To see which ones are installed on our machine we can use:

<?php
print_r(stream_get_transports());
print_r(stream_get_wrappers());
print_r(stream_get_filters());

My installation outputs the following:

Array
(
    [0] => tcp
    [1] => udp
    [2] => unix
    [3] => udg
    [4] => ssl
    [5] => sslv3
    [6] => sslv2
    [7] => tls
)
Array
(
    [0] => https
    [1] => ftps
    [2] => compress.zlib
    [3] => compress.bzip2
    [4] => php
    [5] => file
    [6] => glob
    [7] => data
    [8] => http
    [9] => ftp
    [10] => zip
    [11] => phar
)
Array
(
    [0] => zlib.*
    [1] => bzip2.*
    [2] => convert.iconv.*
    [3] => string.rot13
    [4] => string.toupper
    [5] => string.tolower
    [6] => string.strip_tags
    [7] => convert.*
    [8] => consumed
    [9] => dechunk
    [10] => mcrypt.*
    [11] => mdecrypt.*
)

A nice set, don’t you think?

In addition, we can write our own or use third-party stream wrappers for Amazon S3, MS Excel, Google Storage, Dropbox and even Twitter.

The php:// Wrapper

PHP has its own wrapper to access the language’s I/O streams. There are the basic php://stdin, php://stdout, and php://stderr wrappers that map the default I/O resources, and we have php://input that is a read-only stream with the raw body of a POST request. This is handy when we’re dealing with remote services that put data payloads inside the body of a POST request.

Let’s do a quick test using cURL:

curl -d "Hello World" -d "foo=bar&name=John" http://localhost/dev/streams/php_input.php

The result of a print_r($_POST) in the responding PHP script would be:

Array
(
    [foo] => bar
    [name] => John
)

Notice that the first data pack isn’t accessible from the $_POST array. But if we use readfile('php://input') instead we get:

Hello World&foo=bar&name=John

PHP 5.1 introduced the php://memory and php://temp stream wrappers which are used to read and write temporary data. As the names imply, php://memory always keeps the data in memory, while php://temp keeps it in memory only up to a limit (2 MB by default) and then switches to a temporary file managed by the underlying system.
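As a small sketch of how a temporary stream might be used, the function below writes a string to php://temp, rewinds, and reads it back. The function name roundTripThroughTemp() is an assumption made for illustration; swapping in php://memory should behave the same way for small payloads.

```php
<?php
// Hypothetical helper: write to a temporary stream, rewind, read it back.
function roundTripThroughTemp(string $data): string
{
    $stream = fopen('php://temp', 'r+');   // read/write temporary stream
    fwrite($stream, $data);
    rewind($stream);                       // move the pointer back to the start
    $copy = stream_get_contents($stream);  // read everything that was written
    fclose($stream);                       // the temporary data is discarded here
    return $copy;
}

echo roundTripThroughTemp("line one\nline two\n");
```

Because the stream behaves like any other file handle, the usual functions such as fgets(), fseek(), and ftell() can be used on it as well.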

There’s also php://filter, a meta-wrapper designed to apply filters when opening a stream with functions like readfile() or file_get_contents()/stream_get_contents().

<?php
// Write encoded data
file_put_contents("php://filter/write=string.rot13/resource=file:///path/to/somefile.txt","Hello World");

// Read data and encode/decode
readfile("php://filter/read=string.toupper|string.rot13/resource=http://www.google.com");

The first example uses a filter to encode data written to disk while the second applies two cascading filters reading from a remote URL. The outcome can be from very basic to very powerful in our applications.
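To try the write and read filters without a remote URL, the same idea can be applied to a scratch file. This is a hedged sketch: the helper rot13RoundTrip() and the temp file prefix are made up for the example, and it assumes the string.rot13 filter is available (it appears in the filter list above).

```php
<?php
// Hypothetical helper: encode on write, then read the file raw and decoded.
function rot13RoundTrip(string $text): array
{
    $path = tempnam(sys_get_temp_dir(), 'filter_demo_');

    // The write filter encodes the data before it reaches the disk
    file_put_contents("php://filter/write=string.rot13/resource=$path", $text);

    // Reading without a filter shows what is actually stored
    $raw = file_get_contents($path);

    // Applying rot13 again on read restores the original text
    $decoded = file_get_contents("php://filter/read=string.rot13/resource=$path");

    unlink($path);
    return [$raw, $decoded];
}

list($raw, $decoded) = rot13RoundTrip('Hello World');
echo $raw, "\n";      // Uryyb Jbeyq
echo $decoded, "\n";  // Hello World
```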

Stream Contexts

A context is a stream-specific set of parameters or options which can modify and enhance the behavior of our wrappers. A common use of contexts is customizing the HTTP wrapper, which lets us avoid cURL for simple network operations.

<?php
$opts = array(
  'http'=>array(
    'method'=>"POST",
    'header'=> "Auth: SecretAuthToken\r\n" .
        "Content-type: application/x-www-form-urlencoded\r\n" .
              "Content-length: " . strlen("Hello World"),
    'content' => 'Hello World'
  )
);
$default = stream_context_get_default($opts);
readfile('http://localhost/dev/streams/php_input.php');

First we define our options array, an array of arrays with the format $array['wrapper']['option_name'] (the available context options vary depending on the specific wrapper). Then we call stream_context_get_default() which returns the default context and accepts an optional array of options to apply. The readfile() statement uses these settings to fetch the content.

In the example, the content is sent inside the body of the request so the remote script will use php://input to read it. We can access the headers using apache_request_headers() and obtain:

Array
(
    [Host] => localhost
    [Auth] => SecretAuthToken
    [Content-type] => application/x-www-form-urlencoded
    [Content-length] => 11
)

We’ve modified the default context options, but we can create alternative contexts to be used separately as well.

<?php
$alternative = stream_context_create($other_opts);
readfile('http://localhost/dev/streams/php_input.php', false, $alternative);
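One possible shape for such an alternative context is sketched below. The helper buildPostContext() is hypothetical; it only assembles the options and inspects them with stream_context_get_options(), so it runs without any server. The header names and URL are the ones used earlier in this article.

```php
<?php
// Hypothetical helper: builds a POST context for the HTTP wrapper.
function buildPostContext(string $body, array $headers = [])
{
    // Append the headers that describe the request body
    $headers[] = 'Content-type: application/x-www-form-urlencoded';
    $headers[] = 'Content-length: ' . strlen($body);

    return stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => implode("\r\n", $headers),
            'content' => $body,
        ],
    ]);
}

$context = buildPostContext('Hello World', ['Auth: SecretAuthToken']);

// Inspect the options stored in the context; no request is sent here
print_r(stream_context_get_options($context));

// To send the request, pass the context as the third argument, e.g.:
// readfile('http://localhost/dev/streams/php_input.php', false, $context);
```

Unlike the default context, a context created this way affects only the calls that receive it explicitly.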

Conclusion

How can we harness the power of streams in the real world? And where can we go from here? As we’ve seen, streams share some or all of the filesystem-related functions, so the first use that comes to my mind is a series of virtual filesystem wrappers to use with PaaS providers like Heroku or AppFog that don’t have a real filesystem. With little or no effort we can port our apps from standard hosting services to these cloud services and enjoy the benefits. Also, as I’ll show in a follow-up article, we can build custom wrappers and filters for our applications that implement custom file formats and encodings.

03 Jul 04:53

The PHP.cc: PHP 5.5: New CLASS Constant

The PHP.cc have posted another article in their series looking at the new features that come with the latest release of PHP (5.5). In this new post they cover the "CLASS" constant.

Last week, the first stable version of PHP 5.5 was released. It introduced a class-level constant, aptly named CLASS, that is automatically available on all classes and holds the fully-qualified name of that class. [...] So why would you need such a constant? [...] When you need the fully qualified name of a namespaced class that is referenced by a namespace alias ... then it gets interesting.

He illustrates with an example of a unit test using stubs and mocks. The normal method requires the definition of the class namespace in the "getMock" call. With the CLASS constant, PHP can extract that information from the namespace referenced in the "use" and drop it in as a replacement.

Link: http://thephp.cc/viewpoints/blog/2013/06/php-5-5-new-class-constant

03 Jul 04:53

Running Monte Carlo Simulations in PHP

by J Armando Jeronymo

One of the exciting things in the 1980s was programming simulations to solve complex analytical problems, and one of the most useful techniques employed was the Monte Carlo simulation. The approach runs a simulation many times over to calculate the most likely outcome.

Although PHP isn’t known as a scientific or research programming language, Monte Carlo simulations can easily be written in a PHP web app. In this article, I’ll show you how.

A Real-World Problem

The day before yesterday I had a meeting at 9:00 AM about 100 miles away from my home. At 6:30 I was awake and dressed, and while having breakfast I began to work out the next couple of hours on a notepad. I obviously wanted to arrive at the meeting safely and on time, so I started sketching the drive split in eight stages: urban departure, country road, state road northbound, state road eastbound, state highway eastbound, urban crossing, state highway southbound, and urban arrival. The sketch more or less looked like this:

[Image: montecarlo1]

My wife had filled the gas tank the evening before and ideally I could drive straight out to the country road. The tires seemed alright when I looked at them, but doubt whether or not to make a 10 minute detour to properly check their pressure hounded me. If I stopped and checked the tires, I’d be certain of their pressure. Otherwise, I’d have to travel with the uncertainty. Poor tire pressure could have an effect on my driving stability and speed.

I could leave as early as 6:40 but then my wife would have to take our daughter to school instead of going straight to work. If I waited another 10 minutes, I could be at the school as their gates opened for the morning and spare my wife the trouble. The school is on my way out of town so there’d be no significant added travel time.

I went back to my drawing and added the following:

[Image: montecarlo2]

Sipping my second cup of coffee, I stood by the window. A clear dawn sky and brisk morning breeze agreed with the perfect-day forecast on my smartphone, which made me believe the drive would be a fast one. To finish my planning, I drew from past experience and wrote down the estimated drive times:

[Image: montecarlo3]

The expected time en route was 115 minutes (1 hour 55 minutes), which I could cover non-stop. My ETA was 8:35 if I left straight for the road and 8:55 if I decided to take my daughter to school and check the tires.

But no plan survives its first encounter with the enemy! For some mysterious reason, many other parents decided to drop their children off at school earlier than usual, and I had lost more than 5 minutes on what was planned to be a quick detour. Knowing that my baseline was already compromised, I decided to skip the tire check and drive straight to the country road.

I actually reached the road five minutes sooner than the original worst-case estimate and the drive went well until, somewhere between milestones B and C, I ran into dense fog which reduced my visibility. This lowered my average speed and made it harder to overtake the slow but long trucks.

The urban traffic in the town I had to cross was much lighter than usual and it didn’t take me more than 10 minutes to go across. And a few miles onto state highway southbound the fog lifted so I was able to drive at the full legal speed. But when I approached my destination, I realized that some road work that was in progress had blocked the exit I planned to take. This added another 10 minutes to my travel time and needless to say I arrived late.

Modeling the Trip in PHP

I believe most PHP coding is dedicated to profit and non-profit business websites. But PHP can be a fine tool for scientific research and may well be easier to teach to non-professional programmers, like engineers and scientists, than other languages like my beloved Python.

Let’s write the basic code that will help me understand how much earlier or later I could have arrived at the meeting if one or more stages of my plan varied substantially from their baseline estimates. We can begin to model the trip as follows:

<?php 
class MyTrip 
{ 
    protected $departureTime; 
    protected $meetingTime; 
    protected $travelTimes; 

    public function __construct() { 
        $this->setDepartureTime('0640'); 
        $this->setMeetingTime('0900'); 

        // travel times in minutes between milestones 
        $this->setTravelTimes(array( 
            'AB' => 17, 
            'BC' => 17, 
            'CD' => 36, 
            'DE' => 9, 
            'EF' => 15, 
            'FG' => 15, 
            'GH' => 6 
        )); 
    } 

    // for convenience convert time string to minutes past
    // midnight 
    protected static function convertToMinutes($timeStr) { 
        return substr($timeStr, 0, 2) * 60 +
            substr($timeStr, 2, 2); 
    } 

    public function setDepartureTime($timeStr) { 
        $this->departureTime = self::convertToMinutes($timeStr); 
    } 

    public function setMeetingTime($timeStr) { 
        $this->meetingTime = self::convertToMinutes($timeStr); 
    } 

    public function setTravelTimes(array $travelTimes) { 
        $this->travelTimes = $travelTimes; 
    } 

    public function checkPlan($stopAtSchool = true, $checkTires = true) {
        // ...
    }
}

Plans must be feasible, and a suitable criterion for judging this one is whether the earliest departure time plus the sum of all travel times and delays is less than or equal to the time of the meeting. That is what the checkPlan() method determines:

<?php
public function checkPlan($stopAtSchool = true, $checkTires = true) {
    // calculate the sum of travel times between milestones
    $travelTime = array_sum($this->travelTimes);

    // add delay if dropping kid off at school
    $schoolDelay = ($stopAtSchool) ? 10 : 0;

    // add delay if checking tire pressure
    $tiresDelay = ($checkTires) ? 10 : 0;

    // find the total schedule baseline
    $meetingArriveTime = $this->departureTime + $travelTime +
        $schoolDelay + $tiresDelay;

    // does the traveller make the meeting on time?
    $arriveOnTime = $meetingArriveTime <= $this->meetingTime;

    return array($meetingArriveTime, $this->meetingTime,
        $arriveOnTime);
}

Now all we have to do is create an instance of the class and ask it to check my plan:

<?php
$trip = new MyTrip();
print_r($trip->checkPlan());

Given the default values, the above will output that my original plan was okay:

Array
(
    [0] => 535
    [1] => 540
    [2] => 1
)

I should be there 535 minutes after midnight, and the meeting takes place 540 minutes after midnight. According to the baseline, I’ll arrive at the meeting at 8:55 AM, just 5 minutes before the scheduled time!
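To translate the minutes-past-midnight values back into clock times, a small inverse of convertToMinutes() can help. The function minutesToClock() below is a hypothetical addition, not part of the MyTrip class as published, and it assumes PHP 7 or later for intdiv().

```php
<?php
// Hypothetical inverse of MyTrip::convertToMinutes():
// turns minutes past midnight into an HH:MM string.
function minutesToClock(int $minutes): string
{
    return sprintf('%02d:%02d', intdiv($minutes, 60), $minutes % 60);
}

echo minutesToClock(535), "\n"; // 08:55
echo minutesToClock(540), "\n"; // 09:00
```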

But what about the inevitable variations? How can we account for the uncertain elements?

Monte Carlo and Adding Randomness

In a very simplistic way, we can attach a safety margin to every event and say it could take up to 10% less or up to 25% more than its scheduled time. Such margins can be randomly applied to the departure delays and to every travel time between milestones by multiplying each value by rand(90,125)/100.

We can also assign a 50% probability to each of the two decisions: dropping my daughter off at school and checking the tires. Again the rand() function can help us:

$this->checkTires = rand(1, 100) > 50;

Putting it all together, we can define a method checkPlanRisk() to compute whether or not I can arrive on time given the many uncertainties that stand in my way:

<?php
public function checkPlanRisk() {
    // adjust and sum travel times between milestones
    $travelTime = 0;
    foreach ($this->travelTimes as $t) {
        $travelTime += $t * rand(90, 125) / 100;
    }

    // decide whether to drop kid off at school and randomly set
    // the delay time
    $schoolDelay = 0;
    if (rand(1, 100) > 50) {
        $schoolDelay = 10 * rand(90, 125) / 100;
    }
    
    // ditto for checking tires
    $tiresDelay = 0;
    if (rand(1, 100) > 50) {
        $tiresDelay = 10 * rand(90, 125) / 100;
    }

    // find the total schedule baseline
    $meetingArriveTime = $this->departureTime + $travelTime +
        $schoolDelay + $tiresDelay;

    // does the traveller make the meeting on time?
    $arriveOnTime = $meetingArriveTime <= $this->meetingTime;

    return array($schoolDelay, $tiresDelay, $meetingArriveTime,
        $this->meetingTime, $arriveOnTime);
}

The question now is, how likely is the traveller to arrive on time given the initial conditions and the assumed uncertainty? Monte Carlo simulations answer this by running the experiment a large number of times and computing a “level of confidence”, defined as the ratio of on-time arrivals to the total number of trials.

If we run this method a sufficient number of times and record how often the traveller arrives on time, we can establish a level of confidence that the trip is feasible as planned.

<?php
public function runCheckPlanRisk($numTrials) {
    $arriveOnTime = 0;
    for ($i = 1; $i <= $numTrials; $i++) {
        $result = $this->checkPlanRisk();
        if ($result[4]) {
            $arriveOnTime++;
        }

        echo "Trial: " . $i;
        echo " School delay: " . $result[0];
        echo " Tire delay: " . $result[1];
        echo " Enroute time: " . $result[2];

        if ($result[4]) {
            echo " -- Arrive ON TIME";
        }
        else {
            echo " -- Arrive late";
        }

        $confidence = $arriveOnTime / $i;
        echo "\nConfidence level: $confidence\n\n";
    }
}

Creating a new instance of MyTrip and asking it to compute the confidence level for 1,000 trials is straightforward:

<?php
$trip = new MyTrip();
$trip->runCheckPlanRisk(1000);

The output will be a screen print-out of 1,000 entries like this:

Trial: 1000 School delay: 0 Tire delay: 11.3 Enroute time: 530.44 -- Arrive ON TIME
Confidence level: 0.716

With the parameters above, the confidence level seems to converge to 0.72, which roughly indicates that the traveler has a 72% chance of getting to the meeting on time.

Roughly speaking, Monte Carlo relies on the convergence of the mean result of a sequence of identical experiments. The mean result is also known as the expected value; when each trial is scored as 1 for success and 0 for failure, it can be read as the probability of the desired result occurring. Of course this is quite an old concept, and such simulations were carried out long before the arrival of the digital computer.
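The convergence can be observed by repeating the experiment with a growing number of trials. The following stand-alone sketch re-implements the trip model outside the MyTrip class; the function names simulateTrip() and estimateConfidence() are assumptions made for this example, and mt_rand() is used in place of rand(). The figures (travel times, 06:40 departure, 09:00 meeting, 90% to 125% margins, two optional 10-minute stops) are taken from the article.

```php
<?php
// Hypothetical stand-alone trial: returns true when the simulated trip
// arrives on or before the meeting time (all times in minutes).
function simulateTrip(array $legs, int $departure, int $meeting): bool
{
    $total = $departure;

    // each leg takes between 90% and 125% of its planned time
    foreach ($legs as $minutes) {
        $total += $minutes * mt_rand(90, 125) / 100;
    }

    // two optional 10-minute stops (school, tires), each taken with
    // 50% probability and subject to the same margin
    for ($stop = 0; $stop < 2; $stop++) {
        if (mt_rand(1, 100) > 50) {
            $total += 10 * mt_rand(90, 125) / 100;
        }
    }

    return $total <= $meeting;
}

// Fraction of on-time trials: the Monte Carlo estimate of the probability.
function estimateConfidence(int $numTrials): float
{
    $legs = [17, 17, 36, 9, 15, 15, 6]; // travel times from the article
    $onTime = 0;
    for ($i = 0; $i < $numTrials; $i++) {
        if (simulateTrip($legs, 400, 540)) { // 06:40 departure, 09:00 meeting
            $onTime++;
        }
    }
    return $onTime / $numTrials;
}

// The estimate should fluctuate less as the number of trials grows
foreach ([100, 1000, 10000, 100000] as $n) {
    printf("%6d trials: %.3f\n", $n, estimateConfidence($n));
}
```

Since the generator is not seeded, the exact figures differ from run to run; with many trials they should settle near the value reported above.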

Conclusion

Today, we have immensely powerful resources at our disposal and a very convenient language like PHP which can be used to create both the simulation logic and a user-friendly web interface for it. With a few lines of code, we created a model and applied Monte Carlo simulations to it, producing an intelligible result useful for business decisions. This is a field worth exploring and it would be nice to see some scientific computation functions added to PHP so scientists and engineers could take advantage of this versatile, no-nonsense language.

Code for this article is available on PHPMaster’s GitHub page.

Image via Fotolia

03 Jul 04:30

Realistic Criteria

I'm leaning toward fifteen. There are a lot of them.