Self-referential URLs



What's the most reliable, generic way to construct a self-referential URL? In other words, I want to generate the http://www.site.com[:port] portion of the URL that the user's browser is hitting. I'm using PHP running under Apache.

A few complications:

  • Relying on $_SERVER["HTTP_HOST"] is dangerous, because that seems to come straight from the HTTP Host header, which someone can forge.

  • There may or may not be virtual hosts.

  • There may be a port specified using Apache's Port directive, but that might not be the port that the user specified, if it's behind a load-balancer or proxy.

  • The port may not actually be part of the URL. For example, 80 and 443 are usually omitted.

  • PHP's $_SERVER["HTTPS"] doesn't always give a reliable value, especially if you're behind a load-balancer or proxy.

  • Apache has a UseCanonicalName directive, which affects the values of the SERVER_NAME and SERVER_PORT environment variables. We can assume this is turned on, if that helps.





1:



The most reliable way is to provide it yourself.
The site should be coded to be hostname neutral, but to know about a special configuration file.


This file doesn't get put into source control for the codebase because it belongs to the webserver's configuration.


The file is used to set things like the hostname and other webserver-specific parameters.


You can accommodate load balancers, changing ports, etc., because you're saying that if an HTTP request hits that code, it can assume exactly as much as you let it assume.
This trick also helps development, incidentally.


:-).
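
A minimal sketch of the configuration-file idea, not a definitive implementation. The helper name build_base_url(), the array keys, and the example path are all hypothetical; the only assumption taken from the question is that ports 80 and 443 are normally left out of the URL.

```php
<?php
// Hypothetical helper: builds "scheme://host[:port]" from per-server settings.
function build_base_url(array $config)
{
    $scheme = $config['scheme'];
    $host   = $config['host'];
    $port   = isset($config['port']) ? (int) $config['port'] : null;

    // Ports that browsers omit for each scheme.
    $defaultPorts = array('http' => 80, 'https' => 443);

    $url = $scheme . '://' . $host;
    if ($port !== null && (!isset($defaultPorts[$scheme]) || $defaultPorts[$scheme] !== $port)) {
        $url .= ':' . $port;
    }
    return $url;
}

// In a real deployment this array would come from the server-specific file
// that is kept out of source control, for example:
//   $config = require '/etc/mysite/site_config.php';  // hypothetical path
$config = array('scheme' => 'http', 'host' => 'www.site.com', 'port' => 8080);

echo build_base_url($config), "\n"; // prints http://www.site.com:8080
```

Because nothing here reads $_SERVER, a forged Host header or a proxy in front of Apache cannot change the result.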


2:


I would suggest that the only way to be sure and to be secure is to define a constant for the URL in some kind of config file for the site.

You could generate the constant with $_SERVER['HTTP_HOST'] as a default:
define('SITE_URL', 'http://' . $_SERVER['HTTP_HOST'] . '/');
and replace it with a hard-coded definition on deployments where security really matters:
define('SITE_URL', 'http://foo.bar.com:8080/');


3:


As I recall, you want to do something like this:
$protocol = 'http';
$port = '';
if (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] != 'off') {
    $protocol = 'https';
    if ($_SERVER['SERVER_PORT'] != 443) {
        $port = $_SERVER['SERVER_PORT'];
    }
} else if ($_SERVER['SERVER_PORT'] != 80) {
    $port = $_SERVER['SERVER_PORT'];
}
// Server name is going to be whatever the virtual host name is set to in your configuration
$address = $protocol . '://' . $_SERVER['SERVER_NAME'];
if (!empty($port)) {
    $address .= ':' . $port;
}
// REQUEST_URI already includes the query string, so QUERY_STRING is not appended again
$address .= $_SERVER['REQUEST_URI'];
I haven't tested this code, because I don't have PHP handy at the moment.


4:


$_SERVER["HTTP_HOST"] is probably the best way, after some validation of course. Yes, the user specifies it and so it cannot be trusted, but you can easily detect when the user is playing games with it.
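
One way to do that validation is to compare the header against a list of hostnames the site is known to answer to. The sketch below assumes such a list exists; the function name validated_host(), the allow-list contents, and the fallback value are hypothetical and not part of the answer above.

```php
<?php
// Hypothetical helper: returns the requested host (with any port) only when
// the hostname part is on the allow-list; otherwise returns the fallback.
function validated_host(array $server, array $allowedHosts, $fallback)
{
    $host = isset($server['HTTP_HOST']) ? strtolower($server['HTTP_HOST']) : '';

    // Strip a trailing ":port" before comparing hostnames.
    $hostOnly = preg_replace('/:\d+$/', '', $host);

    return in_array($hostOnly, $allowedHosts, true) ? $host : $fallback;
}

// Assumed allow-list and fallback, for illustration only.
$allowed = array('www.site.com', 'site.com');

echo validated_host(array('HTTP_HOST' => 'www.site.com:8080'), $allowed, 'www.site.com'), "\n";
echo validated_host(array('HTTP_HOST' => 'evil.example'), $allowed, 'www.site.com'), "\n";
```

In a request handler the first argument would be $_SERVER. A forged header then falls back to a known hostname instead of ending up in generated links.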


5:


One idea for checking that $_SERVER['HTTP_HOST'] is valid could be to validate it by DNS.

I've used this method in one or two cases without serious consequences to speed, and I believe it fails silently if provided an IP address. See http://www.php.net/manual/en/function.gethostbyname.php. Pseudo code might be:
define('SITEHOME', in_array(gethostbyname($_SERVER['HTTP_HOST']), array(/* ... valid IPs */)) ? $_SERVER['HTTP_HOST'] : 'default_hostname');


6:


Why generate full URLs at all? If you want the user to stay on the http://host:port/ they are already using, you can use relative URLs instead. Say you are on the page http://xxx:yy/zzz/fff/. You could use either ../graphics/whatever.jpg (to go back one directory from the current one and get http://xxx:yy/zzz/graphics/whatever.jpg) or /zzz/graphics/whatever.jpg (to start at the site root and work down the directories as specified). Both avoid mentioning the host:port part and inherit it from the one currently in use.


