Alternative for file_get_contents() using cURL
file_get_contents() is the easiest and most widely used function for reading the contents of a file; a single call is all it takes. To use it on remote files, however, ‘allow_url_fopen’ must be enabled in ‘php.ini’. Enabling ‘allow_url_fopen’ is a risk, because it opens up several security loopholes. It is therefore better to disable it and use an alternative for fetching remote content.
Some hosts disable file_get_contents() for remote files, which forces you to look for an alternative. In some cases file_get_contents() also returns FALSE or an empty string for perfectly valid URLs.
Because of all this, code that uses file_get_contents() and works perfectly on one machine or host may not work on another. So it is better to use an alternative to get the job done, even if file_get_contents() works on your system.
The function below, which uses cURL, meets this objective. It is more efficient than file_get_contents() and also works for remote files.
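Before switching, it can help to check which fetch mechanisms a given host actually offers. The sketch below is only illustrative: the function name `remote_fetch_capabilities()` is made up for this example, while `ini_get()`, `filter_var()`, `extension_loaded()` and `function_exists()` are standard PHP functions.

```php
<?php
// Hypothetical helper (not part of the original article): reports which
// ways of fetching a remote URL are available on the current host.
function remote_fetch_capabilities()
{
    return array(
        // file_get_contents() can open http:// URLs only when this ini flag is on.
        'allow_url_fopen' => filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
        // The cURL functions exist only when the curl extension is loaded.
        'curl' => extension_loaded('curl') && function_exists('curl_init'),
    );
}

$caps = remote_fetch_capabilities();
echo 'allow_url_fopen: ', $caps['allow_url_fopen'] ? 'on' : 'off', "\n";
echo 'cURL extension:  ', $caps['curl'] ? 'available' : 'missing', "\n";
```

If both are reported as available, either approach can work on that host, and the choice comes down to the portability concerns described above.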
Function:
```php
function curl_file_get_contents($url)
{
    $curl = curl_init();
    $userAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)';

    curl_setopt($curl, CURLOPT_URL, $url); //The URL to fetch. This can also be set when initializing a session with curl_init().
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE); //TRUE to return the transfer as a string of the return value of curl_exec() instead of outputting it out directly.
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 5); //The number of seconds to wait while trying to connect.

    curl_setopt($curl, CURLOPT_USERAGENT, $userAgent); //The contents of the "User-Agent: " header to be used in a HTTP request.
    curl_setopt($curl, CURLOPT_FAILONERROR, TRUE); //To fail silently if the HTTP code returned is greater than or equal to 400.
    curl_setopt($curl, CURLOPT_FOLLOWLOCATION, TRUE); //To follow any "Location: " header that the server sends as part of the HTTP header.
    curl_setopt($curl, CURLOPT_AUTOREFERER, TRUE); //To automatically set the Referer: field in requests where it follows a Location: redirect.
    curl_setopt($curl, CURLOPT_TIMEOUT, 10); //The maximum number of seconds to allow cURL functions to execute.

    $contents = curl_exec($curl);
    curl_close($curl);
    return $contents;
}
```
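Lines 10 to 14 (the CURLOPT_USERAGENT, CURLOPT_FAILONERROR, CURLOPT_FOLLOWLOCATION, CURLOPT_AUTOREFERER and CURLOPT_TIMEOUT options) may not be required in all cases, but I have included them to be on the safe side.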
Usage:
```php
$file_content = curl_file_get_contents('http://25labs.com');
```
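Because curl_exec() returns FALSE when a request fails, callers may want to know why. The following is a minimal sketch of a variant that reports the cURL error; the name `curl_fetch_with_error()` and the shape of the returned array are assumptions made for this example, while `curl_errno()` and `curl_error()` are standard cURL functions.

```php
<?php
// Hypothetical variant of curl_file_get_contents() that reports why a
// request failed instead of only returning FALSE.
function curl_fetch_with_error($url)
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_FAILONERROR, true); // treat HTTP codes >= 400 as failures
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 5);
    curl_setopt($curl, CURLOPT_TIMEOUT, 10);

    $contents = curl_exec($curl);

    if ($contents === false) {
        // curl_errno() and curl_error() describe the last failure on this handle.
        return array('ok' => false, 'contents' => null,
                     'errno' => curl_errno($curl), 'error' => curl_error($curl));
    }
    // The handle is released automatically when $curl goes out of scope.
    return array('ok' => true, 'contents' => $contents, 'errno' => 0, 'error' => '');
}

$result = curl_fetch_with_error('http://25labs.com');
if ($result['ok']) {
    echo strlen($result['contents']), " bytes received\n";
} else {
    echo 'Request failed (', $result['errno'], '): ', $result['error'], "\n";
}
```

Comparing with `=== false` matters here, since an empty page body is a valid response and would otherwise be mistaken for a failure.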
Another advantage of this cURL method is that it was faster than file_get_contents() in my tests. I verified this using the timer class below, which can be used to benchmark how long queries, functions, and entire pages take to complete.
Class:
```php
class timer
{
    public $start;
    public $pause_time;

    // __construct() is used because PHP 8 no longer treats a method
    // named after the class as the constructor.
    function __construct($start = 0)
    {
        // Start the timer immediately if requested
        if ($start) {
            $this->start();
        }
    }

    // Start the timer
    function start()
    {
        $this->start = $this->get_time();
        $this->pause_time = 0;
    }

    // Pause the timer
    function pause()
    {
        $this->pause_time = $this->get_time();
    }

    // Unpause the timer
    function unpause()
    {
        $this->start += ($this->get_time() - $this->pause_time);
        $this->pause_time = 0;
    }

    // Get the current timer value in seconds
    function get($decimals = 8)
    {
        return round(($this->get_time() - $this->start), $decimals);
    }

    // Current time in seconds, with microsecond precision
    function get_time()
    {
        list($usec, $sec) = explode(' ', microtime());
        return ((float)$usec + (float)$sec);
    }
}
```
Usage:
```php
$timer = new timer(1); // Constructor starts the timer, so no need to do it ourselves

/* ... mysql query ... */

$query_time = $timer->get();

/* ... page processing ... */

$processing_time = $timer->get();
```
Courtesy: http://davidwalsh.name/php-timer-benchmark/
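For a quick one-off comparison, a single helper that times any callable may be enough. This is a minimal sketch and not the code used for the measurements in this post; `time_call()` is a hypothetical name, and `microtime(true)` returns the current Unix time as a float.

```php
<?php
// Hypothetical helper: runs $fn once and returns array(result, elapsed seconds).
function time_call($fn)
{
    $start = microtime(true);            // float seconds since the Unix epoch
    $result = call_user_func($fn);
    $elapsed = microtime(true) - $start;
    return array($result, $elapsed);
}

// Example: time a cheap local operation. A real comparison would wrap
// file_get_contents($url) and curl_file_get_contents($url) the same way.
list($value, $seconds) = time_call(function () {
    return array_sum(range(1, 1000));
});
printf("Result %d computed in %.6f seconds\n", $value, $seconds);
```

Timing each method several times and averaging, as in the tables in this post, smooths out network jitter between runs.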
The results of my tests were remarkable. The cURL method was faster in every run. The average increase in speed was about 260% for downloading www.google.com and about 200% for 25labs.com, which means file_get_contents() took on average roughly 3.6 and 3 times as long as cURL, respectively. The complete test results are shown below.
Google.com
| cURL [seconds] | file_get_contents() [seconds] | Difference [seconds] | Increase in speed [percentage] |
|---|---|---|---|
| 0.888551 | 4.50224614 | 3.61369514 | 406.6952983 |
| 0.48516011 | 1.05094695 | 0.56578684 | 116.6185819 |
| 1.12226319 | 2.02632523 | 0.90406204 | 80.55704295 |
| 0.48129988 | 1.03340602 | 0.55210614 | 114.7114643 |
| 1.47997999 | 4.94326782 | 3.46328783 | 234.0090983 |
| 1.88002896 | 5.40903306 | 3.5290041 | 187.7100925 |
| 0.45613194 | 2.84422398 | 2.38809204 | 523.5529088 |
| 0.49854517 | 0.97496605 | 0.47642088 | 95.5622296 |
| 0.46968794 | 0.97128391 | 0.50159597 | 106.7934531 |
| 0.49630809 | 4.06055999 | 3.5642519 | 718.1530932 |
| Average increase in speed [percentage] | | | 258.4363263 |
25labs.com
| cURL [seconds] | file_get_contents() [seconds] | Difference [seconds] | Increase in speed [percentage] |
|---|---|---|---|
| 4.55120611 | 9.20000005 | 4.64879394 | 102.1442191 |
| 2.00834012 | 5.72773004 | 3.71938992 | 185.1972125 |
| 1.92637897 | 4.68373394 | 2.75735497 | 143.1366835 |
| 1.42010713 | 4.20022202 | 2.78011489 | 195.7679693 |
| 1.75494885 | 6.52712488 | 4.77217603 | 271.9267875 |
| 1.684129 | 8.10363817 | 6.41950917 | 381.1768083 |
| 1.66302896 | 5.54452515 | 3.88149619 | 233.3991941 |
| 2.46324897 | 5.85300899 | 3.38976002 | 137.6133741 |
| 1.62948298 | 4.60624599 | 2.97676301 | 182.6814423 |
| 1.99712992 | 5.29414988 | 3.29701996 | 165.0879057 |
| Average increase in speed [percentage] | | | 199.8131596 |
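
The last two columns of the tables appear to be derived from the first two: the difference is the file_get_contents() time minus the cURL time, and the increase in speed is that difference as a percentage of the cURL time. The sketch below reproduces the first Google.com row under that assumption; `speed_increase()` is a hypothetical name.

```php
<?php
// Assumed derivation of the last two table columns (not stated in the article):
//   difference = file_get_contents() time - cURL time
//   increase % = difference / cURL time * 100
function speed_increase($curlSeconds, $fgcSeconds)
{
    $difference = $fgcSeconds - $curlSeconds;
    return array(
        'difference' => $difference,
        'percent'    => $difference / $curlSeconds * 100,
    );
}

// First Google.com row: 0.888551 s (cURL) vs 4.50224614 s (file_get_contents()).
$row = speed_increase(0.888551, 4.50224614);
printf("Difference: %.8f s, increase: %.4f%%\n", $row['difference'], $row['percent']);
```

Read this way, an increase of 100% means file_get_contents() took twice as long as cURL for that run.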

Secure pages (https://) return a blank response. Is there a way to retrieve the contents of secure pages using cURL?
great article! thanks for the help!
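Just add the lines below to the function if you wish to retrieve the contents of a secure webpage. Note that these two options switch off SSL certificate verification, so the connection is no longer protected against man-in-the-middle attacks.
```php
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, 0);
```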
Thanks, it works for me.
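When the blank response is caused by cURL not finding a CA bundle, rather than by a genuinely untrusted site, verification can stay enabled by telling cURL where the trusted certificates are. This is a sketch under that assumption: `secure_curl_options()` is a hypothetical helper, `$caBundle` is an assumed path to a PEM file of trusted CA certificates, and https://example.com/ is a placeholder URL.

```php
<?php
// Hypothetical helper: cURL options for https:// URLs that keep certificate
// checks enabled. $caBundle is an assumed path to a PEM file of trusted CAs.
function secure_curl_options($caBundle = null)
{
    $options = array(
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_SSL_VERIFYPEER => true, // verify the server certificate chain
        CURLOPT_SSL_VERIFYHOST => 2,    // certificate must match the host name
    );
    if ($caBundle !== null) {
        $options[CURLOPT_CAINFO] = $caBundle; // CA bundle used for verification
    }
    return $options;
}

$curl = curl_init('https://example.com/');
curl_setopt_array($curl, secure_curl_options());
$contents = curl_exec($curl);
if ($contents === false) {
    echo 'HTTPS request failed: ', curl_error($curl), "\n";
}
```

If the request still fails, the message from curl_error() usually indicates whether the certificate or the CA bundle is the problem.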