Best practices

Principles

Whether you are developing a web application or a smartphone application, you will probably be concerned with its reliable operation and availability. Here is some advice to help you get the best from our platform.

Direct access to restricted datasets

As you may have noticed, some of our data is available only if you have an account and the authorization to use it. As soon as you embed these credentials in your code, it is up to you to take care of their safety, their non-disclosure and their ease of use throughout their life cycle (software updates, deletion…).

In Javascript/Ajax

One of the peculiarities of Javascript is that it is fully loaded and executed on the client side. If you embed your credentials in your Javascript code, they are automatically accessible to any user of your website. Tricks and workarounds can certainly make their retrieval more difficult, but sooner or later they will appear in the requests you send to our services. Thus, for any Javascript/Ajax development, you must route all requests through the proxyfication technique described below.

In PHP, Python, Java

These server-side languages raise no such problem: you can follow the provided Examples and code snippets.

On mobile platform

Applications developed for mobile platforms are reasonably safe: they are generally compiled and distributed to users in a format that is difficult to reverse-engineer, so the information written into them is hard to retrieve. On the other hand, you must consider a practical aspect tied to the life cycle of your credentials. Should you change your login, or even your account, how will the new credentials be distributed to all the devices using your software? Will users have to update their software? Can you make this update mandatory? How many users are you ready to lose in such an operation? You must take this into account before you start distributing your solution.

Performance

Another important question concerns the performance of our platform. It is not sized to carry the same workload as www.google.fr. If your application is successful, which we hope it will be, and a single tweet propels it to the heights of popularity very quickly, are you sure that our infrastructure will withstand the sudden load and give your users the best experience with your application?

Conclusion

For each of the situations above, you have to take care to implement the recommended practices. The simplest way is sometimes to retrieve the data directly and serve it from your own premises.

Data retrieval

Your user rights allow you to retrieve the data and store it on your own servers. The retrieval frequency must be set according to the update frequency of the data. Once the data is hosted in your infrastructure, the problems mentioned above are far easier to solve:

  • on mobile, users directly access your own copy of the data. If your credentials change, only your data retrieval script is affected, not the application deployed on your users' devices, which can rely on your own authentication/authorization system and procedures.
  • for performance: occasional retrieval, even when frequent, has a minimal effect on our platform. Provided that your own infrastructure is correctly sized to hold the load generated by your application, you have no problem to fear on our side.
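
The retrieval step described above can be sketched as a small Python 3 script run from a scheduler such as cron. This is only a sketch under assumptions: the URL, the local path, the refresh interval and the function names are hypothetical placeholders, and authentication against our services is left out.

```python
import os
import tempfile
import time
import urllib.request

# Hypothetical values: replace them with the dataset you are entitled to use.
SOURCE_URL = "https://my_target_server/path/to/dataset"
LOCAL_COPY = "/var/www/data/dataset.json"
MAX_AGE_SECONDS = 3600  # align this with the update frequency of the dataset


def needs_refresh(path, max_age, now=None):
    """Return True if the local copy is missing or older than max_age seconds."""
    if not os.path.exists(path):
        return True
    now = time.time() if now is None else now
    return now - os.path.getmtime(path) > max_age


def store_atomically(path, payload):
    """Write payload (bytes) to path so that readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        os.unlink(tmp_path)
        raise


def refresh(url=SOURCE_URL, path=LOCAL_COPY, max_age=MAX_AGE_SECONDS):
    """Download the dataset only when the local copy is stale."""
    if not needs_refresh(path, max_age):
        return False
    with urllib.request.urlopen(url, timeout=30) as response:
        store_atomically(path, response.read())
    return True
```

Calling refresh() from a scheduled job keeps the local copy current, while your application only ever reads LOCAL_COPY and never contacts our platform directly.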

Proxyfication

Whether the data is installed in your infrastructure or accessed directly from your application through our services, there are use cases in which proxyfication is nevertheless mandatory. This is mostly the case in a Javascript environment, in which you have two major concerns:

  • the impossibility of hiding the credentials used to access the data. However, if you have set up a user-specific authentication system after retrieving the data, you have already solved part of the problem.
  • access to a third-party domain in Ajax. This is forbidden by the “Same Origin Policy”, which requires content retrieved through Ajax to come from the same protocol, domain and port as the main web page that issued the request. This rule doesn’t apply to images, which explains why external WMS streams work perfectly well, but it applies to all other types of services: WFS, JSON or KML. Please note that our servers are configured to allow Ajax requests from external domains, but if you rely on other data servers, you may face this situation.

In order to get around these restrictions, you can use the proxyfication trick. The basic principle is to send the Ajax requests to a script located on your own server (and therefore sharing the origin of your main page), which forwards these requests to the external service without suffering from the same constraints as the client. This also lets you separate the credentials used to access your server from the ones used to access ours.

WEB CLIENT——> PROXY SCRIPT ——> EXTERNAL SERVICE
user:toto………user:my_account……>
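
The credential separation shown in the diagram can be sketched in Python 3 as follows. This is an illustration under assumptions rather than our reference implementation: it assumes the external service accepts HTTP Basic authentication, and the function names and the PLATFORM_USER / PLATFORM_PASSWORD environment variables are hypothetical.

```python
import base64
import os
import urllib.request


def build_upstream_request(url, user, password):
    """Return a request for the external service carrying HTTP Basic credentials."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def credentials_from_environment(environ=os.environ):
    """Read the platform credentials from hypothetical environment variables."""
    return environ["PLATFORM_USER"], environ["PLATFORM_PASSWORD"]
```

The web client authenticates against your server with its own login (toto in the diagram). Only the proxy script knows the platform account, and it attaches that account to the outgoing request, so the platform credentials never reach the browser.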

Implementing a proxyfication script is fairly simple, but you have to respect a few rules. For instance, don’t leave it open to everybody, or malicious people will use it to surf the internet anonymously. And don’t let it do whatever the user asks either: control the accepted operations and reject all others.
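
These two rules can be sketched as a single check run before any request is forwarded. The sketch below is Python 3 and rests on assumptions: the host name and the list of operations are hypothetical examples, and it assumes the operation is passed in a `request` query parameter, as in WFS key-value requests.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical policy: adapt both sets to the services you actually use.
ALLOWED_HOSTS = {"my_target_server"}
ALLOWED_OPERATIONS = {"getcapabilities", "describefeaturetype", "getfeature"}


def is_request_allowed(url, hosts=ALLOWED_HOSTS, operations=ALLOWED_OPERATIONS):
    """Accept only http(s) URLs to known hosts asking for a known operation."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL
    if parts.scheme not in ("http", "https"):
        return False
    # hostname ignores any port or user info placed in front of the host
    if parts.hostname not in hosts:
        return False
    # compare parameter names and the operation without regard to case
    params = {key.lower(): values for key, values in parse_qs(parts.query).items()}
    requested = params.get("request", [""])[0].lower()
    return requested in operations
```

For example, is_request_allowed("https://my_target_server/wfs?REQUEST=GetFeature") is accepted, while a Transaction request or a request to any other host is rejected. Parsing the URL instead of splitting it on "/" also rejects addresses such as https://my_target_server@evil.example/, where the allowed name is only the user part.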

Python implementation

The Python script below is shipped with OpenLayers under the name proxy.cgi. It fits the needs of WFS usage well. To activate it, put it in an executable directory of the web server (generally cgi-bin), make it executable (e.g. chmod a+x proxy.cgi) and tell OpenLayers to use it as a proxy (because OpenLayers is smart but not psychic!) with the directive:

OpenLayers.ProxyHost = "/cgi-bin/proxy.cgi?url=";

For any further information about the use of OpenLayers and a Proxy script, please refer to the dedicated FAQ: http://trac.osgeo.org/openlayers/wiki/FrequentlyAskedQuestions#ProxyHost

#!/usr/local/bin/python


"""This is a blind proxy that we use to get around browser
restrictions that prevent the Javascript from loading pages not on the
same server as the Javascript.  This has several problems: it's less
efficient, it might break some sites, and it's a security risk because
people can use this proxy to browse the web and possibly do unacceptable
activities with it.  It only loads pages via http and https, but it can
load any content type. It supports GET and POST requests."""

import urllib2
import cgi
import sys, os

# Designed to prevent Open Proxy type stuff.
# replace 'my_target_server' by the external domain you are aiming to
allowedHosts = ['localhost','my_target_server']

method = os.environ["REQUEST_METHOD"]

if method == "POST":
    qs = os.environ["QUERY_STRING"]
    d = cgi.parse_qs(qs)

    # use the url parameter of the POST request if present; otherwise fall
    # back to a default URL, which is rejected below unless its host is allowed
    if d.has_key("url"):
        url = d["url"][0]
    else:
        url = "http://www.openlayers.org"
else:
    fs = cgi.FieldStorage()
    # use the url parameter of the GET request, or the same default URL
    url = fs.getvalue('url', "http://www.openlayers.org")

try:
    host = url.split("/")[2]

    # reply with HTTP 502 code if the host is not allowed
    if allowedHosts and host not in allowedHosts:
        print "Status: 502 Bad Gateway"
        print "Content-Type: text/plain"
        print
        print "This proxy does not allow you to access that location (%s)." % (host,)
    # checks if the request is a http or https request
    elif url.startswith("http://") or url.startswith("https://"):

        if method == "POST":
            length = int(os.environ["CONTENT_LENGTH"])
            headers = {"Content-Type": os.environ["CONTENT_TYPE"]}
            body = sys.stdin.read(length)
            r = urllib2.Request(url, body, headers)
            y = urllib2.urlopen(r)
        else:
            y = urllib2.urlopen(url)

        # print content type header
        i = y.info()
        if i.has_key("Content-Type"):
            print "Content-Type: %s" % (i["Content-Type"])
        else:
            print "Content-Type: text/plain"
        print
        print y.read()

        y.close()
    else:
        print "Content-Type: text/plain"
        print
        print "Illegal request."

except Exception, E:
    print "Status: 500 Unexpected Error"
    print "Content-Type: text/plain"
    print
    print "Some unexpected error occurred. Error text was:", E

The following PHP script does the same job:

<?php
            /*
            License: LGPL as per: http://www.gnu.org/copyleft/lesser.html
            $Id: proxy.php 3650 2007-11-28 00:26:06Z rdewit $
            $Name$
            */

            ////////////////////////////////////////////////////////////////////////////////
            // Description:
            // Script to redirect the request http://host/proxy.php?url=http://someUrl
            // to http://someUrl .
            //
            // This script can be used to circumvent javascript's security requirements
            // which prevent a URL from an external web site being called.
            //
            // Author: Nedjo Rogers
            ////////////////////////////////////////////////////////////////////////////////

            // define allowed hosts
            $aAllowedDomains = array('localhost','my_target_server');

            // read in the variables

            if(array_key_exists('HTTP_SERVERURL', $_SERVER)){
                    $onlineresource=$_SERVER['HTTP_SERVERURL'];
            }else{
                    $onlineresource=$_REQUEST['url'];
            }
            $parsed = parse_url($onlineresource);
            $host = @$parsed["host"];
            $path = @$parsed["path"] . "?" . @$parsed["query"];
            if(empty($host)) {
                    $host = "localhost";
            }

            if(is_array($aAllowedDomains)) {
                    if(!in_array($host, $aAllowedDomains)) {
                            die("The '$host' domain is not authorized. Please contact the administrator.");
                    }
            }

            $port = @$parsed['port'];
            if(empty($port)){
                    $port="80";
            }
            $contenttype = @$_REQUEST['contenttype'];
            if(empty($contenttype)) {

                    $contenttype = "text/html; charset=ISO-8859-1";
            }
            // read the raw POST body ($HTTP_RAW_POST_DATA was removed in PHP 7)
            $data = file_get_contents('php://input');
            // define content type
            header("Content-type: " . $contenttype);

            if(empty($data)) {
                    $result = send_request();
            }
            else {
                    // post XML; the result is echoed once, further down

                    $posting = new HTTP_Client($host, $port, $data);
                    $posting->set_path($path);
                    $result = $posting->send_request();
            }

            // strip leading text from result and output result
            $len=strlen($result);
            $pos = strpos($result, "<");
            if($pos > 1) {
                    $result = substr($result, $pos, $len);
            }
            //$result = str_replace("xlink:","",$result);
            echo $result;

            // define class with functions to open socket and post XML
            // from http://www.phpbuilder.com/annotate/message.php3?id=1013274 by Richard Hundt

            class HTTP_Client {
                    var $host;
                    var $path;
                    var $port;
                    var $data;
                    var $socket;
                    var $errno;
                    var $errstr;
                    var $timeout;
                    var $buf;
                    var $result;
                    var $agent_name = "MyAgent";
                    //Constructor, timeout 30s (PHP 8 no longer treats a method named after the class as the constructor)
                    function __construct($host, $port, $data, $timeout = 30) {
                            $this->host = $host;
                            $this->port = $port;
                            $this->data = $data;
                            $this->timeout = $timeout;
                    }

                    //Opens a connection
                    function connect() {
                            $this->socket = fsockopen(
                                    $this->host,
                                    $this->port,
                                    $this->errno,
                                    $this->errstr,
                                    $this->timeout
                            );
                            return (bool) $this->socket;
                    }

                    //Set the path
                    function set_path($path) {
                            $this->path = $path;
                    }

                    //Send request and clean up
                    function send_request() {
                            if(!$this->connect()) {
                                    return false;
                            }
                            else {
                                    $this->result = $this->request($this->data);
                                    return $this->result;
                            }
                    }

                    function request($data) {
                            $this->buf = "";
                            // replace my_account:my_password with your own credentials
                            fwrite($this->socket,
                                    "POST $this->path HTTP/1.0\r\n".
                                    "Host: $this->host\r\n".
                                    "Authorization: Basic ".base64_encode("my_account:my_password")."\r\n".
                                    "User-Agent: $this->agent_name\r\n".
                                    "Content-Type: application/xml\r\n".
                                    "Content-Length: ".strlen($data).
                                    "\r\n".
                                    "\r\n".$data.
                                    "\r\n"
                            );

                            // read the whole response, then close the socket
                            while(!feof($this->socket)) {
                                    $this->buf .= fgets($this->socket, 2048);
                            }
                            $this->close();
                            return $this->buf;
                    }


                    function close() {
                            fclose($this->socket);
                    }
            }



            function send_request() {
                    global $onlineresource;
                    $ch = curl_init();
                    $timeout = 5; // set to zero for no timeout

                    // fix to allow HTTPS connections with incorrect certificates;
                    // this disables certificate checks, so only use it for hosts you trust
                    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
                    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);

                    //curl_setopt($ch, CURLOPT_USERPWD, 'my_account:my_password');
                    //curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);

                    curl_setopt($ch, CURLOPT_URL,$onlineresource);
                    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
                    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
                    curl_setopt($ch, CURLOPT_ENCODING , "gzip, deflate");

                    if( ! $file_contents = curl_exec($ch)){
                            trigger_error(curl_error($ch));
                    }
                    curl_close($ch);
                    $lines = array();
                    $lines = explode("\n", $file_contents);
                    if(!($response = $lines)) {
                            echo "Unable to retrieve file '$onlineresource'";
                    }
                    $response = implode("",$response);
                    return utf8_decode($response);
            }
    ?>

Summary

As seen, there are different strategies to choose from, depending on the streams you want to use and the kind of platform you are developing for. Using WMS in a web application will be easier than dealing with heavy WFS in an iOS app. The most prominent approaches are nevertheless the following:

  • For simple images, without authentication, use the direct stream from our servers.
  • For heavily loaded text streams (WFS, JSON…), retrieve the data regularly and serve it from your own server. This can also save you from using a proxy script.
  • For mobile applications on smartphones, prioritize the autonomy of the application over the data access methods. Retrieve the data and implement a service that lists the available data, so that you can later add new data layers to your application without having to update and redeploy it.
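
The listing service mentioned in the last point could be as simple as a JSON catalog published next to your copy of the data. The Python 3 sketch below is hypothetical throughout: the catalog layout, the field names and the function names are assumptions chosen for illustration.

```python
import json


def build_catalog(layers):
    """Serialize the available layers as a JSON document the app can poll."""
    entries = [
        {"id": layer_id, "title": info["title"], "url": info["url"]}
        for layer_id, info in sorted(layers.items())
    ]
    return json.dumps({"version": 1, "layers": entries}, sort_keys=True)


def new_layers(known_ids, catalog_json):
    """Return the ids listed in the catalog that the app has not seen yet."""
    catalog = json.loads(catalog_json)
    return [entry["id"] for entry in catalog["layers"] if entry["id"] not in known_ids]
```

The server regenerates the catalog whenever a dataset is added. The application downloads it at startup and compares it with the layers it already knows, so a new layer becomes available without redeploying the application.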