Alternative for file_get_contents() using cURL

file_get_contents() is the easiest and most widely used function for reading the contents of a file: a single call returns the whole file as a string. To use it on remote files, however, you need to enable ‘allow_url_fopen’ in your ‘php.ini’ settings. Enabling ‘allow_url_fopen’ is a risk, as it opens up various security loopholes. So it is good practice to disable it and look for an alternative way to fetch remote content.

Some hosts disable file_get_contents() for remote files, in which case you are forced to look for an alternative. In some cases file_get_contents() returns FALSE or an empty string even for perfectly valid URLs.

Because of all this, code that uses file_get_contents() and works perfectly on one machine or host may not work on another. So it is better to use an alternative to get the job done, even if file_get_contents() is working on your system.
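To see which situation a given host is in, a script can check its own capabilities before choosing how to fetch a URL. The following is only a minimal sketch: the helper name choose_fetch_method() is made up for illustration, and it assumes that preferring cURL whenever it is installed is the policy you want.

```php
<?php
// Hypothetical helper (not part of PHP): decide which fetch method to use.
// Returns 'curl', 'fopen', or null when neither is available.
function choose_fetch_method(bool $hasCurl, bool $allowUrlFopen): ?string
{
    if ($hasCurl) {
        return 'curl';   // prefer cURL whenever the extension is present
    }
    if ($allowUrlFopen) {
        return 'fopen';  // file_get_contents() can open remote URLs
    }
    return null;         // no way to fetch remote files on this host
}

// Detect what this host actually supports.
$hasCurl = function_exists('curl_init');
// ini_get() returns a string such as "1", "" or "On"; normalise it to a boolean.
$allowUrlFopen = filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);

echo 'Fetch method: ', choose_fetch_method($hasCurl, $allowUrlFopen) ?? 'none', PHP_EOL;
```

Passing the two capabilities in as arguments keeps the decision itself easy to check without touching the server configuration.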

The function below, which uses cURL, fulfils our objective. It is much more efficient than file_get_contents(), and it can be used for remote files as well.

Function:

function curl_file_get_contents($url)
{
 $curl = curl_init();
 $userAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)';
 
 curl_setopt($curl, CURLOPT_URL, $url); //The URL to fetch. This can also be passed to curl_init().
 curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE); //Return the transfer as a string from curl_exec() instead of outputting it directly.
 curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 5); //The number of seconds to wait while trying to connect.
 
 curl_setopt($curl, CURLOPT_USERAGENT, $userAgent); //The contents of the "User-Agent: " header to be used in the HTTP request.
 curl_setopt($curl, CURLOPT_FAILONERROR, TRUE); //Fail if the HTTP code returned is greater than or equal to 400; curl_exec() then returns FALSE.
 curl_setopt($curl, CURLOPT_FOLLOWLOCATION, TRUE); //Follow any "Location: " header that the server sends as part of the HTTP header.
 curl_setopt($curl, CURLOPT_AUTOREFERER, TRUE); //Automatically set the "Referer: " field in requests that follow a "Location: " redirect.
 curl_setopt($curl, CURLOPT_TIMEOUT, 10); //The maximum number of seconds to allow cURL functions to execute.
 
 $contents = curl_exec($curl);
 curl_close($curl);
 return $contents;
}

The options from CURLOPT_USERAGENT through CURLOPT_TIMEOUT (lines 10 to 14 of the function) may not be required in all cases, but I have included them to be on the safe side.

Usage:

$file_content = curl_file_get_contents('http://25labs.com');
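Note that curl_exec() returns FALSE when the transfer fails, so the return value is worth checking before it is used. The variant below is a sketch rather than a replacement for the function above: the name curl_fetch_or_throw() is hypothetical, and it assumes you would rather receive an exception carrying the cURL error message than a bare FALSE.

```php
<?php
// Hypothetical variant of the function above: throws instead of returning FALSE.
function curl_fetch_or_throw(string $url): string
{
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);  // return the body as a string
    curl_setopt($curl, CURLOPT_FAILONERROR, true);     // treat HTTP codes >= 400 as failures
    curl_setopt($curl, CURLOPT_CONNECTTIMEOUT, 5);     // seconds to wait for the connection
    curl_setopt($curl, CURLOPT_TIMEOUT, 10);           // seconds allowed for the whole transfer

    $contents = curl_exec($curl);
    if ($contents === false) {
        // curl_errno() and curl_error() describe why the transfer failed.
        $code = curl_errno($curl);
        $message = curl_error($curl);
        throw new RuntimeException("cURL error $code: $message", $code);
    }

    return $contents;
}
```

A caller would wrap curl_fetch_or_throw('http://25labs.com') in a try/catch block for RuntimeException and read the reason for the failure from $e->getMessage().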

Another advantage of the cURL method is that it is faster than the traditional file_get_contents(). I verified this using the timer class below, which can be used to benchmark how long queries, functions, and entire pages take to complete.

Class:

class timer
{
	public $start;
	public $pause_time;
 
	function __construct($start = 0){	//Constructor: optionally start the timer (a method named timer() is not a constructor in PHP 8)
		if($start)	$this->start();
	}
 
	function start(){	//Start the timer
		$this->start = $this->get_time();
		$this->pause_time = 0;
	}
 
	function pause(){	//Pause the timer
		$this->pause_time = $this->get_time();
	}
 
	function unpause(){	//Unpause the timer
		$this->start += ($this->get_time() - $this->pause_time);
		$this->pause_time = 0;
	}
 
	function get($decimals = 8){	//Get the current timer value
		return round(($this->get_time() - $this->start),$decimals);
	}
 
	function get_time(){	//Current time in seconds, as a float
		return microtime(true);
	}
}

Usage:

$timer = new timer(1); // Constructor starts the timer, so no need to do it ourselves
 
/* ... mysql query ...*/
 
$query_time = $timer->get();
 
/* ... page processing ...*/
 
$processing_time = $timer->get();
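For comparing two download methods, a single measurement is noisy, so it helps to repeat each call and average the timings. The helper below is a rough sketch of that idea and is not the code used for the tests in this post: average_seconds() is a made-up name, and a short sleep stands in for the download so the example runs without network access.

```php
<?php
// Hypothetical helper: average wall-clock seconds taken by $fn over $runs calls.
function average_seconds(callable $fn, int $runs = 10): float
{
    $total = 0.0;
    for ($i = 0; $i < $runs; $i++) {
        $start = microtime(true);            // current time in seconds, as a float
        $fn();
        $total += microtime(true) - $start;  // add the elapsed time of this run
    }
    return $total / $runs;
}

// Stand-in workload; replace it with the call you want to measure, for example
// function () { curl_file_get_contents('http://25labs.com'); }
$seconds = average_seconds(function () {
    usleep(1000); // sleep for 1 millisecond
}, 5);

printf("Average: %.6f seconds\n", $seconds);
```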

Courtesy: http://davidwalsh.name/php-timer-benchmark/

The results of my tests were remarkable: on average, file_get_contents() took about 3 to 3.6 times as long as cURL. The average increase in speed for downloading www.google.com was about 260%, and for 25labs.com it was about 200%, where the increase is the time difference expressed as a percentage of the cURL time. The complete test results are shown below.

Google.com

| cURL [seconds] | file_get_contents() [seconds] | Difference [seconds] | Increase in speed [percentage] |
|---|---|---|---|
| 0.888551 | 4.50224614 | 3.61369514 | 406.6952983 |
| 0.48516011 | 1.05094695 | 0.56578684 | 116.6185819 |
| 1.12226319 | 2.02632523 | 0.90406204 | 80.55704295 |
| 0.48129988 | 1.03340602 | 0.55210614 | 114.7114643 |
| 1.47997999 | 4.94326782 | 3.46328783 | 234.0090983 |
| 1.88002896 | 5.40903306 | 3.5290041 | 187.7100925 |
| 0.45613194 | 2.84422398 | 2.38809204 | 523.5529088 |
| 0.49854517 | 0.97496605 | 0.47642088 | 95.5622296 |
| 0.46968794 | 0.97128391 | 0.50159597 | 106.7934531 |
| 0.49630809 | 4.06055999 | 3.5642519 | 718.1530932 |

Average increase in speed [percentage]: 258.4363263

25labs.com

| cURL [seconds] | file_get_contents() [seconds] | Difference [seconds] | Increase in speed [percentage] |
|---|---|---|---|
| 4.55120611 | 9.20000005 | 4.64879394 | 102.1442191 |
| 2.00834012 | 5.72773004 | 3.71938992 | 185.1972125 |
| 1.92637897 | 4.68373394 | 2.75735497 | 143.1366835 |
| 1.42010713 | 4.20022202 | 2.78011489 | 195.7679693 |
| 1.75494885 | 6.52712488 | 4.77217603 | 271.9267875 |
| 1.684129 | 8.10363817 | 6.41950917 | 381.1768083 |
| 1.66302896 | 5.54452515 | 3.88149619 | 233.3991941 |
| 2.46324897 | 5.85300899 | 3.38976002 | 137.6133741 |
| 1.62948298 | 4.60624599 | 2.97676301 | 182.6814423 |
| 1.99712992 | 5.29414988 | 3.29701996 | 165.0879057 |

Average increase in speed [percentage]: 199.8131596
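
The "Increase in speed" column can be reproduced from the two timing columns: it is the difference between the two times, divided by the cURL time and multiplied by 100. As a small sketch (the function name speed_increase_percent() is made up for illustration):

```php
<?php
// Extra time taken by file_get_contents() as a percentage of the cURL time:
// (file_get_contents time - cURL time) / cURL time * 100
function speed_increase_percent(float $curlSeconds, float $fgcSeconds): float
{
    return ($fgcSeconds - $curlSeconds) / $curlSeconds * 100;
}

// First Google.com measurement from the results above.
printf("%.2f%%\n", speed_increase_percent(0.888551, 4.50224614)); // prints 406.70%
```

An increase of 200% therefore means that file_get_contents() took three times as long as cURL for the same download.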