united-coders.com

Regular Expression examples in Java

Posted on September 5, 2009 by Nico Heid Posted in Uncategorized

When I first had to use regular expressions in Java I made some fairly common mistakes.
Let’s start out with a simple search.

simple string matching

We want to search the string asdfdfdasdfdfdf for occurrences of dfd. Counting by hand, I can find it four times in the string.
Let’s evaluate what our little Java program says.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexCoding {

	public static void main(String[] args) {

		String source = "asdfdfdasdfdfdf";
		Pattern pattern = Pattern.compile("dfd");
		Matcher matcher = pattern.matcher(source);
		int hits = 0;
		while (matcher.find()) {
			hits++;
		}
		System.out.println(hits);

	}
}

The program prints 2. So either our code is wrong, or the logic works differently than expected. And indeed, it does.

The first important rule of regular expressions in Java is: the search runs from left to right, and a character that has been consumed by a match is not reused for the next one. So when we see dfdfd, the first three letters are used for the match and only fd remains, which is no longer a match.
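The rule is not specific to Java. As a small sketch in JavaScript (picked only for illustration; countMatches is a made-up helper name), a plain pattern consumes the characters it matches, while a lookahead (?=dfd) matches at a position without consuming anything, so overlapping occurrences get counted as well. Java's Pattern accepts the same lookahead syntax.

```javascript
// Count how often a global regex matches in a string.
function countMatches(source, regex) {
  return Array.from(source.matchAll(regex)).length;
}

const source = "asdfdfdasdfdfdf";

// Plain pattern: each match consumes its three characters.
const consumed = countMatches(source, /dfd/g);

// Lookahead: a zero-width match at every position where "dfd" starts.
const overlapping = countMatches(source, /(?=dfd)/g);

console.log(consumed, overlapping); // 2 4
```

Which of the two counts is "right" depends on what you want to know; the point is only that the default behaviour is the non-overlapping one.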

getting IP addresses out of some input

Let’s pretend your application wants to read the IP addresses out of some garbled text you got from a service.
First of all we have to define what an IP address looks like. We have four numbers separated by dots, each ranging from 0 to 255.

Let’s write this as a regular expression. This might not be the best solution; if you have a better or more elegant one, feel free to leave a comment.

Pattern: \b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b (this one is from http://www.regular-expressions.info/examples.html , thanks). This is IPv4 only.
It takes the allowed numbers for one block followed by a dot, repeats that group three times ({3}), and then adds the final block with the same pattern but without the trailing dot. The \b at both ends makes sure the address is not cut out of a longer run of digits or letters.

So let’s see if it works:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexCoding {

	public static void main(String[] args) {

		String logtext = "asdfesgewg 215.2.125.32 alkejo 234 oij8982jld" +
				"kja.lkjwech . 24.33.125.234 kadfjeladfjeladkj";
		
		String regexpatter = "\\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}" +
				"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\b";
		
		Pattern pattern = Pattern.compile(regexpatter);
		Matcher matcher = pattern.matcher(logtext);
		
		while (matcher.find()) {
			System.out.println(logtext.substring(matcher.start(),matcher.end()));
		}
		

	}
}

It prints the two IP addresses found in the string, 215.2.125.32 and 24.33.125.234.
You have to take care to escape the \ properly in the Java string, otherwise you’ll end up with an empty result or an invalid escape sequence error.
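The escaping issue exists wherever a pattern is written as a string. Here is a sketch in JavaScript with the same pattern and log text; the variable names are my own. When the pattern is built from a string, every backslash has to be doubled. With single backslashes, "\b" silently becomes a backspace character and "\." becomes a plain dot, so the pattern no longer finds anything.

```javascript
const logtext = "asdfesgewg 215.2.125.32 alkejo 234 oij8982jld" +
    "kja.lkjwech . 24.33.125.234 kadfjeladfjeladkj";

// One block of the address: 250-255, 200-249, or 0-199.
const octet = "(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";

// Doubled backslashes: the regex engine receives \b and \.
const escaped = new RegExp("\\b(?:" + octet + "\\.){3}" + octet + "\\b", "g");

// Single backslashes: the engine receives a backspace character and a bare dot.
const broken = new RegExp("\b(?:" + octet + "\.){3}" + octet + "\b", "g");

const found = logtext.match(escaped);
const foundBroken = logtext.match(broken);

console.log(found);       // the two addresses
console.log(foundBroken); // null, nothing matched
```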

further reads

http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html
http://en.wikipedia.org/wiki/Regular_expression
http://www.gamedev.net/community/forums/topic.asp?topic_id=357270

Handling the unexpected – Type safe functions in Javascript

Posted on August 5, 2009 by Matthias Reuter Posted in Uncategorized

Javascript is a weird language. Great but weird. Take functions for example. Not only can you pass arguments of any type, you can also pass any number of arguments. This may be quite disturbing, especially for Java developers. Recently I had a discussion with a Java developer who complained about missing code assistance in Javascript, by which he meant no hint about the type of an argument.

This is of course due to the dynamically typed nature of Javascript. While in Java you denote the expected type of a parameter, and it’s the caller’s duty to pass the correct type (even more, you cannot pass a different type), in Javascript it’s the function’s duty to handle the given parameters and do something reasonable with unexpected types. The question arises: How do you write type safe functions in Javascript? Let me explain this by an example implementation to calculate the greatest common divisor of two numbers.

[The greatest common divisor of two integers is the largest integer that divides these integers without a remainder. The greatest common divisor (or gcd for short) of 24 and 15 is 3, since 24 = 3 · 8 and 15 = 3 · 5.]

This is a basic implementation to calculate the greatest common divisor. It’s based upon the fact that gcd(n,m) = gcd(n-m,m) and gcd(n,n) = n. This is a very easy algorithm, though not the fastest.

function greatestCommonDivisor (n, m) {
  if (n === m) {
    return n;
  }
  if (n < m) {
    return greatestCommonDivisor(n, m - n);
  }
  return greatestCommonDivisor(n - m, m);
}

So now, how do you make this function type safe? It's a function for integers, how do you handle, for example, strings?

Leave it to the user

The easiest answer is, you don't. Just leave the responsibility to the user. If he's too dumb to provide integers, let him handle the consequences.

[By the way, what happens when calling this function with strings? Interestingly, you might get the right result,

greatestCommonDivisor("24", "15")

returns 3, but this is sheer luck.

greatestCommonDivisor("9", "24")

results in an infinite loop (in practice, the recursion continues until the engine runs out of stack).]
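Why do strings behave so differently from one call to the next? A short sketch of the operators involved (the variable names are only for illustration): < compares two strings character by character, - always converts to numbers, and === never converts at all.

```javascript
// Two strings are compared lexicographically, not numerically.
const stringCompare = "9" < "24";   // false, because "9" sorts after "2"

// As soon as one operand is a number, the comparison is numeric.
const mixedCompare = 9 < "24";      // true

// Subtraction always converts both operands to numbers.
const difference = "24" - "15";     // 9 (a number)

// Strict equality does not convert, so a number never equals a string.
const strictEqual = 9 === "9";      // false

console.log(stringCompare, mixedCompare, difference, strictEqual);
```

So for ("9", "24") the first call sees "9" < "24" as false and computes "9" - "24", which is -15. From then on the first argument is negative, and the second one only grows with every call.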

The consequences could be (a) the right result (unlikely but possible), (b) an error thrown (ugly, but the user asked for it), (c) unexpected results like NaN, null or undefined (still ugly) or (d) an infinite loop. This last possibility should make us discard the easy answer. We might accept a (we surely would), b and c; d however is out of the question.

Type check arguments

So the second answer is to type check the arguments. We want numbers, so we reject anything else. How do we reject unexpected types? Throw an error, at least for now.

function greatestCommonDivisor (n, m) {
  if (typeof n !== "number" || typeof m !== "number") {
    throw "Arguments must be numbers";
  }

  if (n === m) {
    return n;
  }
  if (n < m) {
    return greatestCommonDivisor(n, m - n);
  }
  return greatestCommonDivisor(n - m, m);
}

Convert arguments

While this is a valid solution, there are better ones. If n and m come from user input, they are likely to be strings. Strings can easily be converted to numbers, so instead of rejecting them, we should convert them. This can be done by calling Number as a function:

function greatestCommonDivisor (n, m) {
  n = Number(n);
  m = Number(m);

  // We had to change that check, since Number() might return NaN
  if (isNaN(n) || isNaN(m)) {
    throw new TypeError("Arguments must be numbers");
  }

  if (n === m) {
    return n;
  }
  if (n < m) {
    return greatestCommonDivisor(n, m - n);
  }
  return greatestCommonDivisor(n - m, m);
}
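It helps to know what Number() does with typical inputs. The following sketch lists a few conversions (the array name is just for illustration). Note that an empty string and null convert to 0 rather than NaN, so they pass the isNaN check above.

```javascript
const conversions = [
  Number("24"),      // 24
  Number(" 24 "),    // 24, surrounding whitespace is ignored
  Number("24.5"),    // 24.5
  Number("24px"),    // NaN, trailing characters are not tolerated
  Number("abc"),     // NaN
  Number(""),        // 0
  Number(null),      // 0
  Number(undefined)  // NaN
];

console.log(conversions);
```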

Better design

So now you know three concepts of handling unexpected types: Ignore them (might result in infinite loops, bad), reject them (why give up so easily, bad) and convert them (good). The concept might be good, but the implementation lacks something. I'll show you what.

First (and that is a minor point) the algorithm is bad. If you pass negative numbers or zero, it results in an infinite loop. Negative numbers can be handled by taking the absolute value (gcd(n,m) equals gcd(-n,m)); zero needs a check of its own.

Second (a minor point as well) the greatest common divisor expects integers, but Number() happily returns non-integer values such as 2.5. This can be solved by using parseInt() instead of Number().

The third point is a major one. The function greatestCommonDivisor is called recursively. That means in every call we type convert and check the arguments, although after the first call we know the arguments have the correct type. There is an elegant solution. First, we convert and check the arguments. Then we define an inner function to do the calculation, which is then called recursively. Since we provide the arguments for the inner function, we don't have to check them again there.

The fourth is a minor point again. When dealing with numbers I'd rather not throw an error but return NaN instead. If the user provides arguments which result in NaN, we should just tell him so.

Therefore we get the following solution to how to write a type safe function in Javascript:

function greatestCommonDivisor (n, m) {
  // convert n to an integer using parseInt,
  // take the absolute value to prevent infinite loops
  n = Math.abs(parseInt(n, 10));
  
  // do the same for m
  m = Math.abs(parseInt(m, 10));

  // check if conversion led to NaN
  if (isNaN(n) || isNaN(m)) {
    return NaN;
  }
  
  // prevent infinite loop when one argument is zero
  if (n === 0) {
    return m;
  }
  if (m === 0) {
    return n;
  }
  
  // now the inner function
  var gcd = function (n, m) {
    // gcd(n,1) is 1, so prevent recursion to speed things up
    if (n === 1 || m === 1) {
      return 1;
    }
    if (n === m) {
      return n;
    }
    if (n < m) {
      return gcd(n, m - n);
    }
    return gcd(n - m, m);
  };
  
  // invoke the inner function
  return gcd(n, m);
}

There, we're done! Now... wouldn't it be great to make type checking automated? Something like

var typeSafeGcd = makeTypeSafe(greatestCommonDivisor, ["int", "int"]);
typeSafeGcd(21, 15);       // would still return 3
typeSafeGcd(21.5, "abc");  // would fail

Yes, this can be done and I will show you how. There are some drawbacks though. First, how do you automatically convert an unexpected type? It's easy for some types, for example if you expect a number. However, how do you convert a string to an array? Simply wrap the string? Split it somehow? Converting requires some thought and depends on the context. Therefore I think it's not a good idea to do so automatically. That leaves checking types and rejecting unsupported ones.

Second, how do you reject unsupported types? You could either return null, undefined or NaN (depending on the context) or throw an error. Both ways lead to the user having to check the results of function calls. In other words, you push the responsibility to provide correct parameters off to the user. Graceful conversion definitely is the better alternative.

So keep on reading, if you still want to know how to automatically make a function type safe.

This is heavy stuff. I do not expect you to understand it, if you are a beginner in Javascript. I try to explain everything in detail, so maybe you should give it a try. I've written a jQuery plugin that makes type safe functions, so if you don't understand how it's done, you can use it nonetheless...

So what do we want? We want some way to automatically check the given arguments by just telling which types we expect. We want to reject all other values. We want to reject the incorrect number of arguments. Let's go step by step.

var makeTypeSafe = function (f, parameterList) {
    return f;
};

We define a makeTypeSafe function that accepts two parameters: f, the function to make type safe, and parameterList, the list of expected types of arguments of f. This function does nothing so far except returning the original function.

Now reject the wrong number of arguments:

var makeTypeSafe = function (f, parameterList) {
    var p = parameterList.length;

    // return a function that first checks the arguments before calling the
    // original function
    return function () {
        // check number of arguments
        if (arguments.length !== p) {
            throw "Unexpected number of arguments. Expected " + p + ", got " + arguments.length;
        }
        
        // call f, passing the arguments, preserving the context
        return f.apply(this, arguments);
    };
};

Here we no longer return the original function but a new function. This new function checks if the number of arguments (arguments.length) is different from the expected number of arguments (parameterList.length). If so, it throws an error. If not, f is called (and its return value returned).

Need an example of how to use it? Here you are.

var add = function (a, b) {
    return a + b;
};
var addIntegers = makeTypeSafe(add, ["int", "int"]);
addIntegers(21);        // will throw an error
addIntegers(21,15);     // will return 36
addIntegers("abc", 42); // will not throw an error

Why doesn't the last call throw an error? Because until now we only check the number of arguments, not the type. Let's change that.

var types = {
    "int" : function (n) {
        // by comparing n to its floor value we see if it's an integer
        return n === Math.floor(n);
    }
};

Here we define an object with one property, a function called int, which checks if a given argument is an integer. Since there is no integer type in Javascript, we do this by comparing n to its floor value (which is the same for integers).

Extending the makeTypeSafe function to check the types of arguments leads to the following code:

var types = {
    "int" : function (n) {
        return n === Math.floor(n);
    }
};

var makeTypeSafe = function (f, parameterList) {
    return function () {
        // check number of arguments
        if (arguments.length !== parameterList.length) {
            throw "Unexpected number of arguments. Expected " + parameterList.length + ", got " + arguments.length + ".";
        }
        
        // check every argument using the types functions defined above
        for (var i = 0, l = arguments.length; i < l; i++) {
            if (!types[parameterList[i]](arguments[i])) {
                throw "Invalid argument at " + i + ". Argument must be of type " + parameterList[i] + ".";
            }
        }
        
        // call original function
        return f.apply(this, arguments);
    };
};

That's basically the way to automatically check parameters before executing the original function. Now we can extend the types object to accept more types:

var types = {
    "int" : function (n) {
        // by comparing n to its floor value we see if it's an integer
        return n === Math.floor(n);
    },
    
    "double" : function (n) {
        // NaN is a number as well, so check that n is not NaN
        return typeof n === "number" && !isNaN(n);
    },
    
    "string" : function (n) {
        return typeof n === "string";
    }
};

We could even add more sophisticated types like natural numbers or arrays of integers.

types["natural"] = function (n) {
    // replace > by >= if 0 is natural to you
    return types["int"](n) && n > 0;
};

types["int[]"] = function (n) {
    // accept only arrays
    if (!(n instanceof Array)) {
        return false;
    }
    // check if every element is an integer
    for (var i = 0, l = n.length; i < l; i++) {
        if (!types["int"](n[i])) {
            return false;
        }
    }
    return true;
};
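Putting the pieces together, here is a minimal self-contained sketch that can be pasted into a console. It follows the steps above with two liberties of my own: it throws Error objects instead of strings, and the name addNaturals is only for this example.

```javascript
// Type checks, as defined above (only the ones needed here).
const types = {
  "int": function (n) {
    return n === Math.floor(n);
  }
};
types["natural"] = function (n) {
  return types["int"](n) && n > 0;
};

// Wrap f so that argument count and types are checked before each call.
function makeTypeSafe(f, parameterList) {
  return function () {
    if (arguments.length !== parameterList.length) {
      throw new Error("Unexpected number of arguments. Expected " +
          parameterList.length + ", got " + arguments.length + ".");
    }
    for (let i = 0; i < arguments.length; i++) {
      if (!types[parameterList[i]](arguments[i])) {
        throw new TypeError("Invalid argument at " + i +
            ". Argument must be of type " + parameterList[i] + ".");
      }
    }
    // call the original function, preserving the context
    return f.apply(this, arguments);
  };
}

const add = function (a, b) {
  return a + b;
};
const addNaturals = makeTypeSafe(add, ["natural", "natural"]);

console.log(addNaturals(21, 15)); // 36

try {
  addNaturals("abc", 42);
} catch (e) {
  console.log(e.message); // Invalid argument at 0. Argument must be of type natural.
}
```

The wrapper knows nothing about add itself, so the same makeTypeSafe works for any function whose parameter types have a check in the types object.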

"Make a jQuery plugin!" my fellow co-author Christian cried. Well, personally I don't fancy jQuery, but if you must make a function type safe, you might be using jQuery anyway, so why not? I extended the list of built-in types to support int, double (alias float), string, boolean, char, object and arrays of those types as well (used via int[], double[] etc). Furthermore, the list of expected parameters is checked for unknown types.

With the plugin, I solved the original task of calculating the greatest common divisor like this:

// Define internal gcd function
var gcd = function (a, b) {
    if (a === b) {
        return a;
    }
    if (a === 1 || b === 1) {
        return 1;
    }
    if (a < b) {
        return gcd(a, b - a);
    }
    return gcd(a - b, b);
};

// extend the known types to support natural numbers
jQuery.makeTypeSafe.types["natural"] = function (n) {
    return jQuery.makeTypeSafe.types["int"](n) && n > 0;
};

// make gcd type safe
var greatestCommonDivisor = jQuery.makeTypeSafe(gcd, ["natural", "natural"]);

// call type safe gcd function
greatestCommonDivisor(21, 15);     // returns 3
greatestCommonDivisor("21", "15"); // throws an error

Any questions, any remarks? Feel free to comment!

Combining HTTP and JavaScript APIs with python on google appengine

Posted on July 30, 2009 by Christian Harms Posted in Uncategorized

In this part I will introduce the python implementation of the ip to geolocation script. It’s more object oriented and hopefully easier to read. In the first part of this article I will describe the solution to read http resources and parse the content. The second part is the same as in the php version. As a conclusion I will compare the results of all five APIs with the data from the cache.

  • the tutorial how to use ip to geolocation provider api
  • the free usage ip to geolocation aggregator script
  • the php stand alone implementation
  • the python google appengine implementation

python: asynchronous http requests on google appengine

First I wanted to build the same multi-url-fetch function as in php, but on the google appengine. There are no threads allowed, and the urllib/httplib modules are masked with the urlfetch module from google. I chose the normal and easy urllib.open call because the google backend works fast. After this was done, I found in the updated URLFetch documentation (since June 18, 2009 or appengine version 1.2.3) the section that says: “To do asynchronous calls you have to use the special module from the urlfetch module”. Have fun with the improved example.

import logging

from google.appengine.api import urlfetch

class InfoItem(dict):
    '''dict that starts fetching the given url in __init__'''
    def __init__(self, url):
        self.rpc = urlfetch.create_rpc()
        urlfetch.make_fetch_call(self.rpc, url)

    def ready(self):
        '''Check if the async call is ready.
        @return True - if got data after parsing
        '''
        try:
            result = self.rpc.get_result()
        except urlfetch.Error, ex:
            logging.error("Error while fetch: %s" % ex)
            return False

        if result.status_code != 200:
            return False
        return self.parse(result.content)
    #ready
#InfoItem

For easy access the result class is based on a python dict. To check whether the API data has been filled into the dict, call the ready() function. You can build the instances of InfoItem, do something else and then ask the instances with the ready function whether the data has arrived (if not, it will wait). Accessing the values is easy because it’s a dict.

Parsing XML data with python

XML should be parsed with the ElementTree module. It's very fast and simple to use. A subclass of InfoItem has two jobs: building the url to the API by simply appending the ip string, and parsing the content.

import logging
import StringIO
import xml.etree.ElementTree as etree
from xml.parsers.expat import ExpatError

class IpInfoDbItem(InfoItem):
    '''Simple parsing of the content of the IpInfoDb-API'''
    def __init__(self, ip):
        '''Init with the IpInfoDb-url'''
        super(IpInfoDbItem, self).__init__("http://ipinfodb.com/ip_query.php?ip="+ip)

    def parse(self, content):
        '''Parse the IpInfoDb-XML and save the keys in the inner dict.
        @return True - if parsing was successful.
        '''
        try:
            #etree needs a file-like-object instead a string!
            t = etree.ElementTree().parse(StringIO.StringIO(content))
            self.update({'name': 'ipinfodb',
                         'country': t.find("CountryName").text or '',
                         'city':    t.find("City").text or '',
                         'lat':     float(t.find("Latitude").text),
                         'long':    float(t.find("Longitude").text)})
            return True
        except (ExpatError, IOError), ex:
            logging.warn("Nothing parsed: %s" % ex)
        return False
    #parse
#IpInfoDbItem

#Test the code directly (if google modules are in the path)
testing = IpInfoDbItem("127.0.0.1")
if testing.ready():
    print testing
#  {'lat': 0.0, 'country': 'Reserved', 'name': 'ipinfodb', 'long': 0.0, 'city': None}

The example starts fetching the data from the IpInfoDb-API in the __init__ function, parses the xml and fills the values into the dict with self.update.

Parsing non-structured data with python

The same hint as in php: use regular expressions for matching the data!

import re

class HostIpItem(InfoItem):
    '''dict that starts fetching the hostip data in __init__'''
    def __init__(self, ip):
        super(HostIpItem, self).__init__("http://api.hostip.info/get_html.php?position=true&ip="+ip)

    def parse(self, content):
        '''Parse the HostIp-Text and save the keys in the inner dict.
        @return True if parsing was successful.
        '''
        # raw string, so the backslashes reach the regex engine unchanged
        match = re.search(r"Country:\s+(.*?)\(\w+\)\nCity:\s+(.*?)\nLatitude: (-*\d+\.\d+)\nLongitude: (-*\d+\.\d+)", content, re.S|re.I)
        if match:
            self.update( {'name': 'hostip',
                          'country': match.group(1),
                          'city': match.group(2),
                          'long': float(match.group(4)),
                          'lat': float(match.group(3))})
            return True
        return False
    #parse
#HostIpItem

Works like the xml example …

Build a complete web application

To put this together you have to define a RequestHandler that fetches the data and produces the javascript. In django style you need the following template; the values in {{ x }} will be replaced with the entries of a dict.

var com = com||{};
com.unitedCoders = com.unitedCoders||{};
com.unitedCoders.geo = com.unitedCoders.geo||{};
com.unitedCoders.geo.ll = {{ ll_json }} ;

{{ maxmind }}
{{ wipmania }}
{{ google }}
document.write('<script type="text/javascript" src="http://pyUnitedCoders.appspot.com/geo_func.js"></script>');

com.unitedCoders.geo.staticMapUrl = function(x, y) {
  var url = "http://maps.google.com/staticmap?key={{ google_key }}&size="+x+"x"+y+"&markers=";
  var colors = ["blue","green","red","yellow","white", "black"];
  for (var i=0; i<com.unitedCoders.geo.ll.length;i++) {
      var s = com.unitedCoders.geo.ll[i];
      url += s.lat+","+s.long+",mid"+colors[i]+(i+1)+"%7C";
  };
  url += this.getLat() + ","+this.getLong() + ",black";
  return url;
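};

And the request handler:

import os

# the original listing does not show where "encoder" comes from;
# the encoder module of a simplejson package is assumed
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app
from google.appengine.ext.webapp import template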

class GeoScript(webapp.RequestHandler):
    def get(self):
        '''Get the location infos for the calling ip (from api).'''

        self.response.headers['Content-Type'] = 'text/plain;charset=UTF-8'

        #result-dict and local location list
        result = {}
        ll = []

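        #Start fetching API data for the calling ip
        #(remote_addr is an assumption, the original listing does not define ip)
        ip = self.request.remote_addr
        ipInfo = IpInfoDbItem(ip)
        hostIp = HostIpItem(ip)

        #Add some more Javascript APIs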
        scriptTemp = "document.write('<script type=\"text/javascript\" src=\"%s\"></script>');"
        result['maxmind'] = scriptTemp % "http://j.maxmind.com/app/geoip.js"
        result['wipmania'] = scriptTemp % "http://api.wipmania.com/wip.js"
        if self.request.get("key"):
            result['google_key'] = self.request.get("key")
            result['google'] = scriptTemp % \
                      ("http://www.google.com/jsapi?key=" + self.request.get("key"))

        #Get the fetched API Data
        if ipInfo.ready():
            ll.append(ipInfo)
        if hostIp.ready():
            ll.append(hostIp)

        result['ll_json'] = encoder.JSONEncoder().encode(ll)

        #Put all together in the javascript template
        path = os.path.join(os.path.dirname(__file__), 'geo.temp')
        self.response.out.write(template.render(path, result))
    #get
#GeoScript

application = webapp.WSGIApplication([
        ('/geo_data.js', GeoScript) ], debug=True)

For more information on how to start a python google appengine web application, start by reading the fine google documentation!

conclusion

Don't mix too many languages - you will get confused! Implementing the server side script in parallel in php and python, plus one version of the advanced functions in javascript, mixes three scripting languages. My first failures were setting semicolons in python and forgetting the block braces in javascript.

After deploying a server side script you can watch it with the dashboard of google's appengine and get all the data: logs and API calls in detail. You can also manage many different versions (default is one) or give deploy access to other google accounts. That's great!

What is the best service provider?

I have done some caching and checked all five API results. Here is the hit rate for the location of visitors of this blog and the distance to the center from the given locations (The center is the average of all long/lat values pairs with a given city value per ip).

Service Provider   lat/long per ip   city per ip   distance to center
maxmind            86%               85%           123 km
WIPmania           89%               0%            1059 km
google             48%               0%            197 km
IPinfoDB           98%               91%           168 km
hostip             35%               53%           404 km

HostIP and google do not offer location data for many visitors. IPInfoDB and MaxMind do not have the same positions for all IPs (as suggested in the comments). At this time WIPMania mainly offers the center of the country, so its positions are not very accurate (in comparison to the calculated center).

How to calculate the distance between lat/long values?

I found some nice functions in javascript (please don't add functions to the prototype to String and Integer!!!), in python the distance function looks like the following lines:

import math

def distance(lat1, long1, lat2, long2):
    # 57.2958 converts degrees to radians, 6378.7 is the earth radius in km
    return 6378.7 * math.acos(math.sin(lat1/57.2958) * math.sin(lat2/57.2958) + math.cos(lat1/57.2958) * math.cos(lat2/57.2958) * math.cos(long2/57.2958 - long1/57.2958))
#distance
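For the javascript side, the same formula (the spherical law of cosines) can be sketched as a plain function, without touching any prototype. This is my own minimal version, not one of the functions I found; it uses the earth radius from above and additionally clamps the cosine, because rounding can push it slightly past 1 for two identical points.

```javascript
// Great-circle distance in kilometres, coordinates given in degrees.
function distance(lat1, long1, lat2, long2) {
  const toRad = Math.PI / 180;
  const cosine =
      Math.sin(lat1 * toRad) * Math.sin(lat2 * toRad) +
      Math.cos(lat1 * toRad) * Math.cos(lat2 * toRad) *
      Math.cos((long2 - long1) * toRad);
  // Math.acos returns NaN outside [-1, 1], so clamp rounding errors away.
  const clamped = Math.min(1, Math.max(-1, cosine));
  return 6378.7 * Math.acos(clamped);
}

// A quarter of the equator, roughly 10020 km.
console.log(distance(0, 0, 0, 90));
```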

Combining HTTP and JavaScript APIs with php

Posted on July 27, 2009 by Christian Harms Posted in Uncategorized

The implementation of the ip to geo location script started as a quick hack in php. Sometimes I use php scripts because everyone can include it in his own website account (but I am not an active php programmer). It’s much easier to write a simple php script instead of deploying a more complex web application. The quick result was this implementation as an example. After getting the php script to work I rewrote it in python, transformed it into a google appengine application and deployed it. Both worked fine and now there are two examples for the API integration available.

  • the tutorial how to use ip to geolocation provider api
  • the free usage ip to geolocation aggregator script
  • the php stand alone implementation
  • the python google appengine implementation

php and http API / RESTful web services

To fetch http resources use the curl module. It’s robust and there aren’t too many possibilities to break it. If you want to use more than one service provider’s API you can access them in a parallel fashion. So the total execution time for the request will be equal to the slowest provider’s roundtrip time instead of the sum of all roundtrip times together.

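  function readUrls($urls) {
    $req = array();
    $mh = curl_multi_init();
    //the setup part is garbled in the original listing; what follows is the
    //usual curl_multi pattern and an assumption about what it looked like
    for($i = 0; $i<count($urls); $i++) {
      $req[$i] = curl_init($urls[$i]);
      curl_setopt($req[$i], CURLOPT_RETURNTRANSFER, 1);
      curl_multi_add_handle($mh, $req[$i]);
    }

    //run all requests until none is active any more
    do {
      curl_multi_exec($mh, $running);
    } while ($running > 0);

    //read the content
    $ret = array();
    for($i = 0; $i<count($urls); $i++) {
      $ret[$i] = curl_multi_getcontent($req[$i]);
      curl_multi_remove_handle($mh, $req[$i]);
    }
    curl_multi_close($mh);
    return $ret;
  }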

This simple function takes an array of urls and returns an array of the contents. Ok, it is only the optimistic way, without any error detection!

Basic XML parsing with PHP

First you have to check if the Content-encoding is set correctly. If not you'll have to convert the http-content to the encoding declared within the xml declaration, which is UTF-8 in the example below.

<?xml version="1.0" encoding="UTF-8"?>
<Response>
        <Ip>74.125.45.100</Ip>
        <Status>OK</Status>
        <CountryCode>US</CountryCode>
        <CountryName>United States</CountryName>
        <RegionCode>06</RegionCode>
        <RegionName>California</RegionName>
        <City>Mountain View</City>
        <ZipPostalCode>94043</ZipPostalCode>
        <Latitude>37.4192</Latitude>
        <Longitude>-122.057</Longitude>
</Response>

And now the simple php code for xml parsing:

if (substr($content,1,4)=='?xml') {
    $xml = simplexml_load_string(utf8_encode($content));
    if ($xml && $xml->City) {
        //Add the geo data to the result set
        $result = array("name" => "ipinfodb",
                        "country" => strval($xml->CountryName),
                        "city" => strval($xml->City),
                        "long" => floatval($xml->Longitude),
                        "lat" => floatval($xml->Latitude));
    }
}

php and parsing non-structured data

For non-structured data regular expressions are the best solution if you can beat the beast. Otherwise try it with several split-actions.

Country: UNITED STATES (US)
City: Sugar Grove, IL
Latitude: 41.7696
Longitude: -88.4588

And now the php code for text parsing:

if (preg_match("/Country:\s+(.*?)\(\w+\)\nCity:\s+(.*?)\nLatitude: (-*\d+\.\d+)\nLongitude: (-*\d+\.\d+)/", $content, $ma)==1) {
    $result = array("name" => "hostip",
                    "country" => $ma[1],
                    "city" => $ma[2],
                    "long" => floatval($ma[4]),
                    "lat" => floatval($ma[3]));
}
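To see what the groups of this pattern actually capture, here is a sketch in javascript that runs the same expression against the sample text from above. The function name parseHostIp is made up for this example, and I assume the lines are separated by plain \n.

```javascript
const content = "Country: UNITED STATES (US)\n" +
    "City: Sugar Grove, IL\n" +
    "Latitude: 41.7696\n" +
    "Longitude: -88.4588\n";

const pattern = /Country:\s+(.*?)\(\w+\)\nCity:\s+(.*?)\nLatitude: (-*\d+\.\d+)\nLongitude: (-*\d+\.\d+)/;

// Returns the parsed record, or null if the text does not match.
function parseHostIp(text) {
  const match = pattern.exec(text);
  if (!match) {
    return null;
  }
  return {
    name: "hostip",
    country: match[1].trim(), // the lazy group keeps the space before "(US)"
    city: match[2],
    long: parseFloat(match[4]),
    lat: parseFloat(match[3])
  };
}

console.log(parseHostIp(content));
```

Note the trailing space in the first group: the php and python versions store "UNITED STATES " including that space unless you trim it.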

php and parsing JSON data

JSON stands for JavaScript Object Notation and is a universal key-value transfer format, especially in web applications. Many APIs (like the IPInfoDB from the xml example) offer this format, and php has a built-in function to parse it. The encoding is defined as utf-8!

if (substr($content,0,1)=='{') {
    $json =  json_decode($content);
    if ($json && $json->{'City'}) {
        $result = array("name" => "ipinfodb",
                        "country" => $json->{'CountryName'},
                        "city" => $json->{'City'},
                        "long" => floatval($json->{'Longitude'}),
                        "lat" => floatval($json->{'Latitude'}));
    }
}

Getting the javascripts data together

The php code described above should produce a simple javascript containing the fetched data and including the other javascripts via document.write, like this:

var com = com||{};
com.unitedCoders = com.unitedCoders||{};
com.unitedCoders.geo = com.unitedCoders.geo||{};
<?php
  //build your result
  echo "com.unitedCoders.geo.ll = [".json_encode($result)."];\n";
?>
document.write('<script type="text/javascript" src="http://j.maxmind.com/app/geoip.js"></script>');
document.write('<script type="text/javascript" src="http://api.wipmania.com/wip.js"></script>');
document.write('<script type="text/javascript" src="geo.js"></script>');

After the other javascript sources are loaded (sequentially), the geo_funcs.js javascript can map from the internal structures to the global com.unitedCoders.geo.ll list and add some helper functions, for example the "get the middle position" of all lat/long values.

add the external javascript data to the result

The maxmind javascript api provides its result as a set of global functions. Here is the maxmind part of geo_funcs.js:

if (window['geoip_country_name']) {
    com.unitedCoders.geo.ll.push({
            "name": "maxmind",
            "country": geoip_country_name(),
            "city": geoip_city(),
            "long": parseFloat(geoip_longitude()),
            "lat": parseFloat(geoip_latitude()) });
};

//calculate the middle point of all positions where city is set
(function() {
    var lat = 0.0;
    var long = 0.0;
    var count = 0;
    for (var i=0; i<com.unitedCoders.geo.ll.length; i++) {
        var service = com.unitedCoders.geo.ll[i];
        if (service.long && service.lat && service.city) {
            lat += parseFloat(service.lat);
            long += parseFloat(service.long);
            count++;
        }
    }
    //avoid a division by zero if no service returned a city
    if (count > 0) {
        com.unitedCoders.geo.lat = lat/count;
        com.unitedCoders.geo.long = long/count;
    }
})();

The function wrapper selects all positions with a city set, hoping that these lat/long pairs are the most accurate positions, and calculates their middle point. After testing I noticed that the google positions are accurate (but no city information is available).

In the next article I will describe the same functionality but implemented in python for the google appengine.

All about types in Javascript – Type detection

Posted on July 3, 2009 by Matthias Reuter Posted in Uncategorized

This is the fourth (and last) part of a series “All about types” in Javascript.

  1. The basics – types and general rules of conversion
  2. Automatic type conversion
  3. Explicit type conversion
  4. Type detection

The drawback of having a loosely typed language like Javascript is that you sometimes have to determine the current type of a variable. This mostly occurs when you create a function that accepts different types of parameters – or is limited to one type.

A while ago, I had a function to open up a dialog to allow the user to send an email. That function accepted a parameter recipient to prefill the to-field. For convenience, I accepted a string or an array as that parameter. If an array was given, I had to convert it to a string. So I needed to check whether I really had been given an array.

Unfortunately, detecting an array is the most complex task of type detection. So I will postpone my solution until later and start with the easier ones.

Basic type detection

Of the five types null, undefined, boolean, number and string, four are very easily detected by using the typeof operator.

typeof "some text"; // "string"
typeof 42;          // "number"
typeof true;        // "boolean"
typeof undefined;   // "undefined"

Detecting an object is similar:

typeof {};    // "object"

Detecting null

Now the problems begin. Some special values require special handling. Take null, for example. The type of null unfortunately is not “null”, it is “object”:

typeof null;   // "object"

To test if a variable is null, you have to compare it to null:

obj === null;  // true, if obj is null, otherwise false

Detecting NaN

The second issue comes with NaN. Its type is “number”, although NaN is Not A Number. Funny, isn’t it? Comparing to NaN does not help either:

NaN === NaN;  // false

I have absolutely no idea why, but the language specification states that "to ECMAScript code, all NaN values are indistinguishable from each other". Fortunately, there is a solution. There is a global function called isNaN that converts its argument to a number and returns true if the result is NaN and false otherwise.

isNaN(NaN);  // true
isNaN(42);   // false
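Because the global isNaN converts its argument to a number before testing it, non-numeric strings are reported as NaN too. A minimal sketch of a stricter check, relying only on the fact that NaN is the one value that is not equal to itself; the name isReallyNaN is made up for this example:

```javascript
// Hypothetical helper: true only for the NaN value itself.
function isReallyNaN(value) {
  // NaN is the only JavaScript value that is not equal to itself.
  return value !== value;
}

isReallyNaN(NaN);     // true
isReallyNaN(0 / 0);   // true
isReallyNaN("text");  // false
isNaN("text");        // true, because "text" is converted to NaN first
```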

An object is an object is an object

Yes, at least for type detection. It does not matter if your object is an array, a date, an HTML-Element or anything else, typeof always returns “object”.

typeof new Date();  // "object"
typeof [];          // "object"
typeof {};          // "object"

Let me come back to the function from the introduction. I had to test whether a parameter was an array or a string. Since we don’t have a solution for detecting arrays yet, my first idea was to circumvent the problem

if (typeof recipient !== "string") {
  recipient = recipient.join(",");
}

but that was a bad idea. It lacks robustness, since my function would throw an error if recipient was, for example, a number. Numbers don’t have a join method.

So I really had to detect the type. As an advanced developer, I knew of two ways: duck-typing and the use of the instanceof operator.

Duck typing

What’s duck typing? “If it walks like a duck and quacks like a duck, I would call it a duck.” (attributed to James Whitcomb Riley). In other words, if the object has a join method, a length and a sort method, call it an array.

In my case, I did this:

function sendMessage (recipient) {
  if (recipient && recipient.join) {
    recipient = recipient.join(",");
  }
}

I only needed to know if the object had a property join, since this was the method I needed to call. That may differ in other situations. Maybe you need to check for sort or splice (both suggesting strongly the object is an array), maybe you only need a length property, in which case your function might also be applicable to strings.

The instanceof operator

Another way to detect the kind of an object is the instanceof operator. It returns true, if the left-hand operand is an instance of the right-hand operand:

[] instanceof Array;        // true
[] instanceof Object;       // yes, true as well
new Date() instanceof Date; // true

Therefore I could have written

function sendMessage (recipient) {
  if (recipient instanceof Array) {
    recipient = recipient.join(",");
  }
}

and mostly, that works fine. Not in my situation, though. The function is called from within an iframe, and that means from a different context. Each window has its own global object, and Array is a constructor property of the global object.

Thus Array in recipient instanceof Array refers to the array constructor of the outer window, while recipient is an instance of the array constructor of the iframe’s window. The two constructors are different objects, so the test returns false.

Detecting the kind of an object using Object.prototype.toString

That was when I discovered a third way to detect the kind of an object. kangax wrote about using the Object.prototype.toString method.

In summary, if Object.prototype.toString is called on an object, it results in a string like “[object Object]”, where Object refers to the Class of the object (“Class” is what I called “kind”). So, calling Object.prototype.toString on an array, results in “[object Array]”:

Object.prototype.toString.call([]) === "[object Array]";  // true

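Please note that you have to use the call method of the toString method. Passing the array as an argument does not work, since the argument is ignored and the method is applied to Object.prototype itself:

Object.prototype.toString([]);  // "[object Object]"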

This approach works with Date, RegExp and other constructors, too.
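The cross-context behaviour can be tried outside the browser as well. The following is only a sketch: it uses the vm module of Node.js as a stand-in for an iframe (an assumption, in a page the second context would be the iframe's window), and classOf is a hypothetical helper name.

```javascript
// Node.js stand-in for an iframe: vm.runInNewContext evaluates code
// against a separate global object with its own Array constructor.
const vm = require("vm");

// Hypothetical helper: extracts "Array" from "[object Array]".
function classOf(value) {
  return Object.prototype.toString.call(value).slice(8, -1);
}

const foreignArray = vm.runInNewContext("[1, 2, 3]");

foreignArray instanceof Array;  // false, it was built by a different Array constructor
classOf(foreignArray);          // "Array"
classOf(new Date());            // "Date"
classOf(/x/);                   // "RegExp"
```

The check does not depend on which global object created the value, which is why it survives the iframe boundary.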

Detecting functions

One exception to the “an object is an object” mantra are functions. Though functions in javascript are objects as well, the type of a function is “function”. So my solution of duck-typing would better be written as this:

function sendMessage (recipient) {
  if (recipient && typeof recipient.join === "function") {
    recipient = recipient.join(",");
  }
}

Summary

There is no consistent way to detect the type of a variable or the kind of an object. That’s why many multi-purpose-libraries offer a function to wrap type detection.
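Such a wrapper could look roughly like the following sketch, which simply combines the checks from this article. The name typeOf and the lower-case return values are choices made for this example, not the API of any particular library.

```javascript
// Hypothetical wrapper combining the checks described in this article.
function typeOf(value) {
  if (value === null) {
    return "null";  // typeof null would report "object"
  }
  var type = typeof value;
  if (type !== "object") {
    return type;    // "string", "number", "boolean", "undefined", "function"
  }
  // "[object Array]" becomes "array", "[object Date]" becomes "date", ...
  return Object.prototype.toString.call(value).slice(8, -1).toLowerCase();
}

typeOf("some text");     // "string"
typeOf(null);            // "null"
typeOf([]);              // "array"
typeOf(new Date());      // "date"
typeOf(function () {});  // "function"
```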

By the way, in the meantime I found another solution to my problem by circumventing type detection:

function sendMessage (recipient) {
  recipient = recipient ? recipient.toString() : "";
}

If recipient is an array, the toString method returns a string containing the values, separated by commas, which is exactly what I needed.

That concludes my series about types in Javascript.

Tutorial for a IP to geolocation aggregator script

Posted on July 2, 2009 by Christian Harms Posted in Uncategorized

If you provide a restaurant guide, wouldn’t it be great to show the nearest restaurant based on the visitor’s position, or the local beer garden specials if the weather is sunny? Or to offer geo-targeted ads on your page? And all of this without registration or a connection to a social network? I will describe the API and implementation details and offer our free “ip to geolocation aggregator” script.

The first step is determining the geo location based on the visitor’s IP address. I have found five free service providers that return the geo position and city/country data for the client IP. Classic “ip to geolocation” data offers are commercial: you have to buy and download a SQL dump or CSV file with the IP ranges of ISPs plus country and city data, some extended with long/lat values. The second solution (which also saves hosting space) is an HTTP API that returns this data directly when called; this looks like the preferred option. The third way is to include a javascript that integrates the geo position directly, without the need for a server component.

Hostip.info

HostIP is a community-powered database of IP mappings with a simple REST API. You can also get the database dump if you have a local database and want to integrate it. Simply start an HTTP request with the IP of your visitor and you get the following 4-line result:

  • URL:
    http://api.hostip.info/get_html.php?ip=xxx&position=true
  • Country: UNITED STATES (US)
    City: Sugar Grove, IL
    Latitude: 41.7696
    Longitude: -88.4588
    

For a simple integration I will use a few lines of php whose output is javascript: fetch the data with the curl module and parse it with a simple regex!

preg_match("/Country:\s+(.*?)\nCity:\s+(.*?)\nLatitude: (-?\d+\.\d+)\nLongitude: (-?\d+\.\d+)/", $DATA, $ma);
echo "  hostip:{country: '".$ma[1]."', city: '".$ma[2]."', long: ".$ma[4].", lat:".$ma[3]."}";
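The sample response above contains a negative longitude, so any pattern for it needs an optional minus sign. The same parsing step can be sketched in javascript; parseHostip is a hypothetical name, and the response format is assumed to be exactly the four lines shown above.

```javascript
// Hypothetical parser for the four-line hostip.info response shown above.
function parseHostip(text) {
  var pattern = /Country:\s+(.*?)\nCity:\s+(.*?)\nLatitude:\s+(-?\d+(?:\.\d+)?)\nLongitude:\s+(-?\d+(?:\.\d+)?)/;
  var match = pattern.exec(text);
  if (!match) {
    return null;  // unexpected format, or no coordinates in the response
  }
  return {
    name: "hostip",
    country: match[1],
    city: match[2],
    lat: parseFloat(match[3]),
    long: parseFloat(match[4])
  };
}

var sample = "Country: UNITED STATES (US)\n" +
             "City: Sugar Grove, IL\n" +
             "Latitude: 41.7696\n" +
             "Longitude: -88.4588";

parseHostip(sample);
// { name: "hostip", country: "UNITED STATES (US)", city: "Sugar Grove, IL",
//   lat: 41.7696, long: -88.4588 }
```

Returning null for an unparsable response lets the caller skip the service instead of pushing an entry full of undefined values.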

IPInfoDB.com

At ipinfodb (better known as iplocationtools.com, found via programmableweb.com) you can find a free service to get the position for an IP via an API webservice (the limitation is 500 connections at a rate faster than 3 per second during a 1h period) or all the data as a SQL dump. I chose the API webservice (easy to integrate) with XML output, because the city names are safe with XML and UTF-8 encoding!

  • URL:
    http://iplocationtools.com/ip_query.php?ip=xxx
  • <?xml version="1.0" encoding="UTF-8"?>
    <Response>
    	<Ip>74.125.45.100</Ip>
    	<Status>OK</Status>
    	<CountryCode>US</CountryCode>
    	<CountryName>United States</CountryName>
    	<RegionCode>06</RegionCode>
    	<RegionName>California</RegionName>
    	<City>Mountain View</City>
    	<ZipPostalCode>94043</ZipPostalCode>
    	<Latitude>37.4192</Latitude>
    	<Longitude>-122.057</Longitude>
    </Response>
    

What to do? Fetch the data and use the SimpleXML parser to read the elements. The php output will be javascript.


$xml = simplexml_load_string($DATA);
echo "  iplocationtools: {country: '".$xml->CountryName."', city: '".$xml->City."', long: ".$xml->Longitude.", lat:".$xml->Latitude."}";

MaxMind.com

MaxMind.com doesn’t offer a free webservice like the others. You have to download a binary package with all IP-related data for your preferred language. That isn’t as convenient as calling a REST interface.

Our iplocation script will include the javascript URL as dynamically loaded javascript. After loading the dynamic parts, it looks in the global javascript context for the maxmind functions and includes the values in the result object.

  • URL: http://j.maxmind.com/app/geoip.js
  • function geoip_country_code() { return 'DE'; }
    function geoip_country_name() { return 'Germany'; }
    function geoip_city() { return 'Stuttgart'; }
    function geoip_region() { return '01'; }
    function geoip_region_name() { return 'Baden-Württemberg'; }
    function geoip_latitude() { return '48.7667'; }
    function geoip_longitude() { return '9.1833'; }
    function geoip_postal_code() { return ''; }
    

This javascript interface is ugly: the data is delivered by defining global functions, which can’t be removed from the global namespace.

<script type="text/javascript" src="http://j.maxmind.com/app/geoip.js"></script>
<script type="text/javascript"> 
if (window.geoip_country_name) {
  com.unitedCoders.geo.ll.push({
      'name':    'maxmind', 
      'country': geoip_country_name(),
      'city':  geoip_city(),
      'long':  parseFloat(geoip_longitude()), 
      'lat':   parseFloat(geoip_latitude())    });
};
</script>

The Javascript code above adds the maxmind values to the location list of our global javascript object.

Google AJAX API

The google AJAX API offers a basic loader for many popular javascript libs. I found a nice feature that is available at startup: a global google.loader.ClientLocation object. In my tests the city parameter was not as accurate as in the other location services, but the lat/long values were fine!

<script  type="text/javascript" src="http://www.google.com/jsapi?key=###api-key###"></script>
<script  type="text/javascript">
  if (window['google'] && google.loader.ClientLocation) {
    com.unitedCoders.geo.ll.push({
      'name':    'google', 
      'country': google.loader.ClientLocation.country,
      'city':       google.loader.ClientLocation.city,
      'long':      google.loader.ClientLocation.longitude, 
      'lat':        google.loader.ClientLocation.latitude    });
  };
</script>

To include the google parameters the google api key must be generated for the domain where the javascript will be used. For this tutorial I used unitedcoderscom.appspot.com as the domain.

WIPmania Location Service

And WIPmania offers a javascript API based on a global javascript object, integration is the same as with the google api.

<script type="text/javascript" src="http://api.wipmania.com/wip.js"></script>
<script  type="text/javascript">
if (window['WIPlocation']) {
   com.unitedCoders.geo.ll.push({
           'name':    'WIPmania',
           'country': WIPlocation.address.country,
           'city':  WIPlocation.address.city,
           'long':  WIPlocation.longitude,
           'lat':   WIPlocation.latitude    });
};
</script>

The integration is the same as with the google api, but there is no need for an application key! After some tests, however, it turned out that the location data is not very precise.

Geo Location API in the browser

After some googling I found a hint from Adam on how to get the geo position directly via javascript if the browser is running on a mobile device. The iPhone implementation has been available since the OS 3.0 update of June 17th 2009. There are also plans for an official W3C API that differs only in the position object. Both implementations provide only the lat/long values, not the name of the nearest city. Including it looks as simple as with the other javascript services described above:

if (window['navigator'] && navigator['geolocation']) {
  navigator.geolocation.getCurrentPosition(function(pos) {
    if (pos['coords']) {
      com.unitedCoders.geo.ll.push({
              name: 'w3c geo-api',
              long: pos.coords.longitude,
              lat: pos.coords.latitude });
    } else {
      com.unitedCoders.geo.ll.push({
              name: 'safari on iphone',
              long: pos.longitude,
              lat: pos.latitude });
    };
  });
};

An automatic integration is not useful, because an ugly popup dialog asks the user whether the location service should be available for the safari application and then for my domain.
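One way to live with the permission dialog is to ask for the position only when the visitor requests it, for example from a click handler. The following is a sketch under assumptions: toGeoEntry and locateOnDemand are hypothetical names, and the geolocation object is passed in as a parameter (in a browser it would be navigator.geolocation) so that the code can also be tried without a browser.

```javascript
// Hypothetical helper: converts either position shape into one list entry.
function toGeoEntry(pos) {
  var hasCoords = !!pos.coords;          // W3C draft shape vs. early iPhone shape
  var source = hasCoords ? pos.coords : pos;
  return {
    name: hasCoords ? "w3c geo-api" : "safari on iphone",
    long: source.longitude,
    lat: source.latitude
  };
}

// Requests the position only when called, so the permission dialog
// appears as a result of the visitor's own action.
function locateOnDemand(geolocation, list) {
  if (!geolocation) {
    return false;  // no geolocation support in this browser
  }
  geolocation.getCurrentPosition(function (pos) {
    list.push(toGeoEntry(pos));
  });
  return true;
}
```

In a page this could be wired to a button, passing navigator.geolocation and com.unitedCoders.geo.ll as the two arguments.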

Combine and Improve it

Every IP-to-geolocation service provider maintains its own database. At home I got three different towns with coordinates that differ by 50 kilometers. To improve this you can choose the two best positions (and discard the rest) or calculate the center point of all results. For some countries the geo location providers know only the center point of the country, so I have chosen only locations with a given city and calculated the middle point of these records for the lat/long values. These values are available via the lat/long functions.

com.unitedCoders.geo = {
  ll: [ { name: 'maxmind', country: 'Germany', city: 'Stuttgart', long: 9.1833, lat: 48.7667 },
        { name: 'iplocationtools', country: 'Germany', city: 'Bietigheim-bissingen', long: 9.1333, lat: 48.9667 },
        { name: 'hostip', country: 'GERMANY (DE)', city: 'Karlsruhe', long: 8.4, lat: 49.05 }
      ],
  getLong: function() { /* ... */ }, // returns 8.905533333333333
  getLat: function() { /* ... */ }   // returns 48.9278
};
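The averaging itself can be sketched as a small standalone function. This is an illustration with the three sample records from above plus one record without a city; centerOfCities is a hypothetical name, and it assumes that lat and long are already numbers.

```javascript
// Hypothetical helper: averages lat/long over all entries that name a city.
function centerOfCities(list) {
  var latSum = 0;
  var longSum = 0;
  var count = 0;
  for (var i = 0; i < list.length; i++) {
    var entry = list[i];
    // typeof checks keep valid 0 coordinates that a truthiness test would drop
    if (entry.city && typeof entry.lat === "number" && typeof entry.long === "number") {
      latSum += entry.lat;
      longSum += entry.long;
      count++;
    }
  }
  if (count === 0) {
    return null;  // no usable record, nothing to average
  }
  return { lat: latSum / count, long: longSum / count };
}

var sampleList = [
  { name: "maxmind", city: "Stuttgart", long: 9.1833, lat: 48.7667 },
  { name: "iplocationtools", city: "Bietigheim-bissingen", long: 9.1333, lat: 48.9667 },
  { name: "hostip", city: "Karlsruhe", long: 8.4, lat: 49.05 },
  { name: "google", long: 9.0, lat: 48.8 }  // no city, ignored
];

centerOfCities(sampleList);  // roughly { lat: 48.9278, long: 8.9055 }
```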

After that I hopefully have a more precise position of the visitor than by using only one geo location provider’s api. Now I need some services to get content based on this information. For the integration of the script I chose to display the position on a map as a simple example.

Show the google map

To visualize the positions on a map the full-featured google map is too big, so I prefer the static google map. It is just a dynamically built img URL with some parameters (get a google map api key for your domain) which returns a static image.

  • URL: http://maps.google.com/staticmap?size=400x300&markers={latitude},{longitude},{size}{color}{alphanumeric-character}

If nothing else is given, the map will center on all markers and fit the zoom level. The API is limited to 1000 calls/hour (from different visitors), but google will be lenient about the limits.
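Building such an image URL from the collected positions could look like the following sketch. staticMapUrl is a hypothetical name; the pipe character between markers and the key parameter are assumptions about the static map API of that time and should be checked against the current documentation before use.

```javascript
// Hypothetical helper: builds a static map URL with one marker per position.
function staticMapUrl(positions, apiKey) {
  var markers = [];
  for (var i = 0; i < positions.length; i++) {
    // marker format from the URL pattern above: {latitude},{longitude}
    markers.push(positions[i].lat + "," + positions[i].long);
  }
  return "http://maps.google.com/staticmap?size=400x300" +
         "&markers=" + markers.join("|") +        // assumed marker separator
         "&key=" + encodeURIComponent(apiKey);    // assumed parameter name
}

staticMapUrl([{ lat: 48.7667, long: 9.1833 }, { lat: 49.05, long: 8.4 }], "KEY");
// "http://maps.google.com/staticmap?size=400x300&markers=48.7667,9.1833|49.05,8.4&key=KEY"
```

The optional size, color and character suffix of each marker is left out to keep the sketch short.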

And now : I know where you are!

How to integrate all these geo location API?

In Part 2 I will explain the implementation details with a short php script and a little python script (for the google app engine). For free usage you can include the ip2geoLocation aggregator via a single javascript.

If you find more free ip-to-geolocation service providers, feel free to comment on this article so we can try to integrate the service.

Maven 2 (Part 3): Configuring eclipse for Apache Maven 2 projects

Posted on June 19, 2009 by Phillip Steffensen Posted in Uncategorized

Today we’re going on with the third part of our Maven 2 tutorial series. Because of the comment of Enrico I decided that this article will focus on how to configure eclipse for the usage of Maven 2 projects and how to generate the eclipse-specific files by using Maven 2. I will show these things by using the example project of part 1 and part 2 of our Maven 2 tutorial series.

As you remember I showed you some simple Maven tasks like

  • mvn clean
  • mvn test and
  • mvn package

Today I will focus on the two other tasks

  • mvn eclipse:clean and
  • mvn eclipse:eclipse

Each of these two tasks combines two pieces of information. The first is located before the colon: "eclipse". This string defines that the Maven 2 eclipse plugin will be used. Maven 2 will automatically download this plugin from the central Maven repository. The string after the colon ("clean" or "eclipse") indicates the explicit goal the Maven 2 eclipse plugin should execute.

Cleaning eclipse resources

Please navigate to the root directory of our example application and run

mvn eclipse:clean

If you now list the root directory of our project you will see that nothing has changed. Why? This is quite easy. If you run "mvn eclipse:clean", Maven will use the Maven 2 eclipse plugin to remove the eclipse resources like the .settings-folder, the .project-file and the .classpath-file. In our case nothing has been deleted because our project never had eclipse-specific resources. To see how Maven 2 generates those things please go on reading this article. 😉

Generating project-specific eclipse resources

Generating project-specific eclipse resources is as easy as removing them. Simply navigate to your project's root directory and run

mvn eclipse:eclipse

Maven will now generate all eclipse resources needed to import the project into the eclipse IDE. Now the project is ready to be imported. But is eclipse ready to use it?

Important. Important! IMPORTANT! Please never commit these eclipse resources to your version control system (e.g. subversion, cvs, git or mercurial). Do it only if you really hate your colleagues.

Importing the project

The following information could be a little bit boring because it is a small click-through guide on how to import the project into the eclipse IDE. Let’s start by running our Maven 2 tasks again:

mvn eclipse:clean eclipse:eclipse

Now we ensured that all previous eclipse resources are removed and generated again. Open your eclipse IDE and click on „File –> Import“. Choose “Existing Projects into Workspace” and hit the “Next >”-button.

[Screenshot: Existing Projects into Workspace]

Make sure that the radio button is set to "Select root directory" and hit the "Browse..."-button.

[Screenshot: Select root directory]

Navigate to your project's parent folder and select your project. Then hit the "OK"-button.

[Screenshot: Navigate to project]

Make sure that the checkbox in front of your project is checked and press "Finish". Now eclipse starts to import the new project.

[Screenshot: Finish]

As you can see, your project cannot be built. Why? Your eclipse IDE is missing the local Maven 2 repository in its build path.

Adding the local Maven 2 repository to the eclipse build path

To let eclipse know where the local Maven 2 repository is located you should know where it is located. By default the Maven 2 repository is located at

/home/[USERNAME]/.m2/repository

on linux platforms and

C:\[PATH_TO_YOUR_HOME_FOLDER]\.m2\repository

on Windows platforms. It is important to configure this directory as a classpath variable in eclipse. To do so please click on "Window --> Preferences" in your eclipse IDE. Navigate to "Classpath variables" of the "Java"- subnode "Build Path" and press the "New..."-button. Enter "M2_REPO" as the name of the new classpath variable and the location of your Maven 2 repository as the associated path.

[Screenshot: Create classpath variable]

After that hit "OK" to create your classpath variable and "OK" again to apply your changes to the IDE preferences. Your eclipse IDE will now ask you to rebuild all projects because your classpath variables have been changed. Choose "Yes"!

[Screenshot: Create classpath variable 2]

As you can see, your project has been successfully imported and you are able to start working on it.

[Screenshot: Project has no errors]

For those of you who would like some further information I've added 3 hints to this article.

Hint: Relocating the local Maven 2 repository

Sometimes the default location of the Maven 2 repository isn't ideal. Therefore Maven 2 allows you to change the repository location. To modify your repository location navigate to the directory

[PATH_TO_YOUR_MAVEN_INSTALLATION]/conf

Open the file "settings.xml", uncomment the XML-node "localRepository" and set it to wherever you want. Make sure that the XML-node "localRepository" is defined outside the comment and save the file. After changing the location of your repository you should modify the classpath variable in your eclipse IDE too. Don't forget that! I spent hours finding out why eclipse didn't find my dependencies although they were placed in my repository, before I realized that I had forgotten to update my classpath variable in eclipse.

Hint: Do not use network drives for the Maven 2 repository

Do not place your Maven 2 repository on network drives. If you do so, eclipse will lose performance due to the network roundtrips between eclipse and the repository.

Hint: After adding new dependencies

After you added some new dependencies please run

mvn clean eclipse:clean eclipse:eclipse

to regenerate the project-specific eclipse resources. Click on your project's root directory in the "Project Explorer" in your eclipse IDE and hit "F5" to let eclipse reload all resources. Only then will eclipse find your new dependencies inside of your Java classes.

Previous Maven 2 articles:
Maven 2 (Part 1): Setting up a simple Apache Maven 2 Project
Maven 2 (Part 2): Dependencies, properties and scopes

Maven 2 (Part 2): Dependencies, properties and scopes

Posted on June 16, 2009 by Phillip Steffensen Posted in Uncategorized

Welcome back to the second part of our tutorial-series on Maven 2. This part will focus on the pom.xml and the Maven 2 dependency management, Maven properties and dependency scopes. To get started let’s first set up a project similar to the project we used in the first part of this tutorial. Set up the project as described in the article Maven 2 (Part 1): Setting up a simple Apache Maven 2 Project and reopen the pom.xml.

Dependencies

Most applications need some dependencies. Commonly we (developers, developers, developers,…) use open source libraries and frameworks (e.g. the spring application framework or apache commons-logging). Sometimes a java project should also reference your own libraries. To solve this problem Maven provides a very good dependency mechanism that manages the dependencies of your project transitively.


If you run

mvn package

Maven will download all dependencies referenced by your pom.xml from the central Maven 2 repository automatically. On mvnrepository.com you can search all dependencies provided by this repository. Sometimes dependencies you need are not provided by the default Maven repository. In this case you don’t have to despair: Maven also allows you to set up and define your own repositories. We’ll explain how to set up your own repositories in a later part of this tutorial-series. Let’s first focus on dependencies and how to add them to your Maven 2 project by using apache commons-logging and the spring application framework.

We are going to start with apache commons-logging. To add apache commons-logging we add the following xml-nodes to our project:

<project xmlns="http://maven.apache.org/POM/4.0.0" [...] > 
  <modelVersion>4.0.0</modelVersion> 
  <groupId>com.unitedcoders</groupId> 
  <artifactId>MyExampleApp</artifactId> 
  <packaging>jar</packaging> 
  <version>0.0.1-SNAPSHOT</version> 
  <name>MyExampleApp</name> 
  <url>http://maven.apache.org</url> 

  <dependencies> 

    <!-- Apache commons-logging --> 
    <dependency> 
      <groupId>commons-logging</groupId> 
      <artifactId>commons-logging</artifactId> 
      <version>1.1.1</version> 
    </dependency> 

    <!-- Testing Dependencies --> 
    <dependency> 
      <groupId>junit</groupId> 
      <artifactId>junit</artifactId> 
      <version>4.5</version> 
      <scope>test</scope> 
    </dependency> 

  </dependencies> 

</project>

As you may have noticed, we made two changes. We changed the version number of junit to version 4.5 and we added a dependency node for commons-logging which contains groupId, artifactId and the version number. The groupId helps Maven to locate the dependency in the Maven repository. The artifactId defines which artifact of the specified group is needed (in our case “commons-logging”) and the “version”-node defines the exact version of the dependency. I think 1.1.1 is the latest version of commons-logging at this time. If you now run

mvn clean package

again you’ll see that Maven automatically downloads apache’s commons-logging-1.1.1.jar from http://repo*.maven.org. To get a little more familiar with this, let’s add some more dependencies by modifying our pom.xml like this:

<project xmlns="http://maven.apache.org/POM/4.0.0" [...] > 
  <modelVersion>4.0.0</modelVersion> 
  <groupId>com.unitedcoders</groupId> 
  <artifactId>MyExampleApp</artifactId> 
  <packaging>jar</packaging> 
  <version>0.0.1-SNAPSHOT</version> 
  <name>MyExampleApp</name> 
  <url>http://maven.apache.org</url> 

  <build> 
    <finalName>MyExampleApp</finalName> 
    <plugins> 
      <plugin> 
        <groupId>org.apache.maven.plugins</groupId> 
        <artifactId>maven-compiler-plugin</artifactId> 
        <configuration> 
          <source>1.5</source> 
          <target>1.5</target> 
        </configuration> 
      </plugin> 
    </plugins> 
  </build> 

  <dependencies> 

    <!-- Spring Framework --> 
    <dependency> 
      <groupId>org.springframework</groupId> 
      <artifactId>spring-core</artifactId> 
      <version>2.5.5</version> 
    </dependency> 

    <dependency> 
      <groupId>org.springframework</groupId> 
      <artifactId>spring-webmvc</artifactId> 
      <version>2.5.5</version> 
    </dependency> 

    <dependency> 
      <groupId>org.springframework</groupId> 
      <artifactId>spring-web</artifactId> 
      <version>2.5.5</version> 
    </dependency> 

    <dependency> 
      <groupId>org.springframework</groupId> 
      <artifactId>spring-jdbc</artifactId> 
      <version>2.5.5</version> 
    </dependency> 
 
    <!-- Apache commons-logging --> 
    <dependency> 
      <groupId>commons-logging</groupId> 
      <artifactId>commons-logging</artifactId> 
      <version>1.1.1</version> 
    </dependency> 
 
    <!-- javax Servlet API --> 
    <dependency> 
      <groupId>javax.servlet</groupId> 
      <artifactId>servlet-api</artifactId> 
      <version>2.4</version> 
      <scope>provided</scope> 
    </dependency> 

    <dependency> 
      <groupId>javax.servlet</groupId> 
      <artifactId>jstl</artifactId> 
      <version>1.1.2</version> 
    </dependency> 
    <dependency> 
      <groupId>taglibs</groupId> 
      <artifactId>standard</artifactId> 
      <version>1.1.2</version> 
      <scope>runtime</scope> 
    </dependency> 
  
    <!-- Testing Dependencies --> 
    <dependency> 
      <groupId>junit</groupId> 
      <artifactId>junit</artifactId> 
      <version>4.5</version> 
      <scope>test</scope> 
    </dependency> 

    <!-- Database --> 
    <dependency> 
      <groupId>commons-dbcp</groupId> 
      <artifactId>commons-dbcp</artifactId> 
      <version>1.2.2</version> 
    </dependency> 

    <dependency> 
      <groupId>mysql</groupId> 
      <artifactId>mysql-connector-java</artifactId> 
      <version>5.1.6</version> 
    </dependency> 
  </dependencies> 

</project>

Please take a few minutes and take a look at all changes we made in pom.xml. After that let's discuss these changes starting at the top of pom.xml:

1. We added a build-node

The build-node contains build-specific configurations for our project. We inserted a "finalName" that defines how our jar-file will be named (in this case: MyExampleApp.jar). We also added the "maven-compiler-plugin" to be able to configure the Java version for which our project should be built.

2. Some more dependencies

And we added some more dependencies which we can use, for example, to develop a web application.

Properties

Is there anything you don't like? Do you see anything that should be improved? Right. There are some redundant version numbers. That isn't very beautiful. We will now see a way to remove this redundancy: Properties. Take a look at the following example on how to use the maven property mechanism.

<project xmlns="http://maven.apache.org/POM/4.0.0" [...] > 
  <modelVersion>4.0.0</modelVersion> 
  <groupId>com.unitedcoders</groupId> 
  <artifactId>MyExampleApp</artifactId> 
  <packaging>jar</packaging> 
  <version>0.0.1-SNAPSHOT</version> 
  <name>MyExampleApp</name> 
  <url>http://maven.apache.org</url> 

  <build> 
     [....]
  </build> 

  <properties> 
    <spring-version>2.5.5</spring-version> 
    <junit-version>4.5</junit-version> 
    <commons-dbcp-version>1.2.2</commons-dbcp-version> 
    <commons-logging-version>1.1.1</commons-logging-version> 
  </properties> 

  <dependencies> 

    <!-- Spring Framework --> 
    <dependency> 
      <groupId>org.springframework</groupId> 
      <artifactId>spring-core</artifactId> 
      <version>${spring-version}</version> 
    </dependency> 

    <dependency> 
      <groupId>org.springframework</groupId> 
      <artifactId>spring-webmvc</artifactId> 
      <version>${spring-version}</version> 
    </dependency> 

    <dependency> 
      <groupId>org.springframework</groupId> 
      <artifactId>spring-web</artifactId> 
      <version>${spring-version}</version> 
    </dependency> 

    <dependency> 
      <groupId>org.springframework</groupId> 
      <artifactId>spring-jdbc</artifactId> 
      <version>${spring-version}</version> 
    </dependency> 
 
    <!-- Apache commons-logging --> 
    <dependency> 
      <groupId>commons-logging</groupId> 
      <artifactId>commons-logging</artifactId> 
      <version>${commons-logging-version}</version> 
    </dependency> 
 
    [....]
 
    <!-- Testing Dependencies --> 
    <dependency> 
      <groupId>junit</groupId> 
      <artifactId>junit</artifactId> 
      <version>${junit-version}</version> 
      <scope>test</scope> 
    </dependency> 

    <!-- Database --> 
    <dependency> 
      <groupId>commons-dbcp</groupId> 
      <artifactId>commons-dbcp</artifactId> 
      <version>${commons-dbcp-version}</version> 
    </dependency> 

    [....]
 
  </dependencies> 

</project>

We've added the "properties"-node that contains our properties as subnodes. You can use these properties anywhere in your pom.xml by surrounding the property name with ${ and }. The Spring dependencies show it best: the version number redundancy is gone, and the version is now defined only once. This isn't the biggest advantage ever and it doesn't seem to be rocket science, but I think it is good to define all versions in one place. Feel free to try out some properties of your own on the other dependency definitions.

Scopes

As you noticed, I added a "scope"-node to some of the project's dependencies. The dependency scope defines to which part of the project's lifecycle the dependency is attached. For example, the JUnit dependency is only needed while compiling and running the tests, so I gave this dependency the scope "test".

Maven knows six dependency scopes:

  • compile: This scope indicates the Maven compilation phase. It is the default scope that will be used if no scope is defined.
  • provided: This scope defines that this dependency is provided by the container (e.g. Apache Tomcat) at runtime.
  • runtime: This scope indicates that this dependency isn't required in the compilation phase but it is needed at runtime.
  • test: "test"-dependencies are only available while compiling and running the tests.
  • system: This scope indicates that the dependency is provided by the system.
  • import: "import" is only used in the pom's <dependencyManagement> section. It indicates that the specified POM should be replaced with the dependencies in that POM's <dependencyManagement> section.

For more information on Maven 2 and how to configure Eclipse to import Maven 2 projects, please stay tuned. See ya... Phil

Previous Maven 2 articles:
Maven 2 (Part 1): Setting up a simple Apache Maven 2 Project

Further Maven 2 articles:
Maven 2 (Part 3): Configuring eclipse for Apache Maven 2 projects

All about types in Javascript – Explicit type conversion

Posted on June 12, 2009 by Matthias Reuter Posted in Uncategorized

This is the third part of a series “All about types” in Javascript.

  1. The basics – types and general rules of conversion
  2. Automatic type conversion
  3. Explicit type conversion
  4. Type detection

Explicit type conversion

When I wrote about automatic type conversion I told my story of testing if “0” really converts to true, which I did by comparison to true. That was wrong, as I found out, but what is the right way? One possibility is

if ("0") {
  alert("true");
}
else {
  alert("false");
}

which is ugly. Furthermore, this technique can only be used to convert values to boolean. How do I explicitly convert, for example, a boolean to a number?

There are two ways to do so. One is a more formal approach, the other is somewhat hackish, though it is commonly used.

To every primitive type (string, number and boolean) there is a corresponding object constructor (String, Number and Boolean). If you call any of these as a function instead of a constructor, they perform type conversion. Thus, to do explicit type conversion, you do this:

Number("42"); // returns 42
String(42);   // returns "42"
Boolean(42);  // returns true
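
The conversion only happens when these are called as plain functions. A small sketch (the variable names are just for illustration) of the difference to calling them with new, which creates wrapper objects instead of primitives:

```javascript
// Called as functions: the results are primitives.
var asNumber = Number("42");
var asString = String(42);
var asBoolean = Boolean(0);

// Called as constructors: the results are wrapper objects.
var numberObject = new Number("42");
var booleanObject = new Boolean(false);

// typeof tells the two apart.
var primitiveTypes = [typeof asNumber, typeof asString, typeof asBoolean];
var objectTypes = [typeof numberObject, typeof booleanObject];

// Every object converts to true, even a wrapper around false.
var wrapperIsTruthy = booleanObject ? true : false;
```

So for type conversion, leave out the new.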

The hackish way to do the same is this:

"42" - 0;  // convert to number
42 + "";   // convert to string
!!42;      // convert to boolean

As stated in the first part, the minus operator – converts all operands to numbers, therefore by subtracting 0, the string “42” is converted to the number 42. Alternatively you could multiply “42” by 1.

If one operand is a string, the plus operator converts the other to a string as well. So by concatenating one value to the empty string, that value is converted to a string.

The logical not operator ! converts its operand to a boolean. Unfortunately, it converts to the opposite value: if "foo" converts to true, !"foo" converts to false. To get the matching value, simply add another not operator: !!"foo" then converts to true. Hooray!
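
To convince yourself that both styles give the same results, you could run something like the following sketch. The sample values and the helper name sameConversion are my own choice, not a complete test.

```javascript
// Hypothetical sample values, chosen to cover the interesting cases.
var samples = ["42", "", "abc", 0, 42, true, false, null, undefined];

// Compare each explicit conversion with its shorthand counterpart.
function sameConversion(value) {
  var explicitNumber = Number(value);
  var shortNumber = value - 0;
  // NaN is never equal to itself, so treat two NaN results as a match.
  var numbersMatch = explicitNumber === shortNumber ||
    (isNaN(explicitNumber) && isNaN(shortNumber));
  return numbersMatch &&
    String(value) === value + "" &&
    Boolean(value) === !!value;
}

var allMatch = true;
for (var i = 0; i < samples.length; i++) {
  allMatch = allMatch && sameConversion(samples[i]);
}
```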

What’s the use of this hackish type conversion? We have a nice, highly comprehensible way to do type conversion by explicitly using Number, String and Boolean. Why on earth would you prefer something like !!"42" to Boolean("42")?

Speed, that is. I did some performance tests to prove that, and yes, !!"42" is faster than Boolean(42). About 17 times faster in Chrome 2, about 5 times faster in IE 6, about 8 times faster in Firefox 3 and so on. Conversion to number is 1.25 (Chrome) to 3.5 times faster (Firefox), to String is about 2.5 times faster in Firefox, but slightly slower in Opera.

But! Let me repeat the mantra of code optimization: “No premature optimization”. Again: “No premature optimization”! The speed advantage for a single conversion is between 0.000116 milliseconds (Chrome, to boolean) and 0.000962 milliseconds (IE 6, to string). Both are less than one microsecond. That’s imperceptible.
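
If you want to measure this yourself, a rough sketch could look like this. The helper timeConversion and the iteration count are assumptions of mine, and the numbers you get depend entirely on your engine, so do not expect the factors quoted above.

```javascript
// Time how long `iterations` calls of `convert` take, in milliseconds.
function timeConversion(convert, iterations) {
  var start = Date.now();
  var last;
  for (var i = 0; i < iterations; i++) {
    last = convert("42");
  }
  return { elapsedMs: Date.now() - start, last: last };
}

var explicitRun = timeConversion(function (v) { return Boolean(v); }, 1000000);
var shorthandRun = timeConversion(function (v) { return !!v; }, 1000000);
```

Compare explicitRun.elapsedMs and shorthandRun.elapsedMs, and run it several times before drawing conclusions.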

So what about parseInt?

So, if you should use Number to convert a value to a number, what’s the use of parseInt and parseFloat? Don’t they do the same?

No, they don’t, not exactly. Number accepts parameters that are string representations of decimal or hexadecimal numbers. A leading zero does not make the string octal.

Number("12");    // decimal, 12
Number("0x12");  // hexadecimal, 18
Number("012");   // decimal, 12 (the leading zero is ignored)

While this is true for parseInt as well, parseInt goes further. parseInt accepts parameters that start with a string representation of a number.

parseInt("314abc");    // 314
Number("314abc");      // NaN

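Beyond that, there is a difference in handling a leading zero. In engines that predate ECMAScript 5, parseInt without a radix interprets a string that starts with “0” (but not with “0x”) as an octal number, no matter what follows. (Since ECMAScript 5 such a string is read as decimal.) Thus, in an older engine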

parseInt("09"); // 0

Since “09” starts with a 0, an octal number is assumed. The next character, “9” is not a valid octal numeric character. Remember, parseInt accepts values that start with a string representation. Since “09” is not a valid octal number, the longest substring “0” is taken.

Number behaves differently. It never treats a leading “0” as an octal prefix, so “09” is read as a decimal number and the leading “0” is ignored.

Number("09");  // 9

This problem often occurs when parsing user data. Imagine the user enters a date as a string in the format “2009-06-09”.

var userString = "2009-06-09";
var dateParts = userString.split("-"); // ["2009", "06", "09"]
var year = parseInt(dateParts[0]);     // 2009
var month = parseInt(dateParts[1]);    // 6
var day = parseInt(dateParts[2]);      // 0, bang!

There are two ways to prevent errors like that. The first is to use Number instead:

var userString = "2009-06-09";
var dateParts = userString.split("-"); // ["2009", "06", "09"]
var year = Number(dateParts[0]);       // 2009
var month = Number(dateParts[1]);      // 6
var day = Number(dateParts[2]);        // 9, hurray!

The second is to explicitly denote a base:

var userString = "2009-06-09";
var dateParts = userString.split("-");   // ["2009", "06", "09"]
var year = parseInt(dateParts[0], 10);   // 2009
var month = parseInt(dateParts[1], 10);  // 6
var day = parseInt(dateParts[2], 10);    // 9, hurray!

Personally, I prefer the second way. By the way, parseInt is the fastest way in Chrome and Firefox 3 to convert a string to number, while it’s the slowest in all other browsers I tested.
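
Wrapped into a small helper, the second way could look like the following sketch. The function name parseDateParts and the decision to return null for bad input are my own assumptions.

```javascript
// Hypothetical helper: parse "YYYY-MM-DD" into numeric parts, always base 10.
function parseDateParts(userString) {
  var parts = userString.split("-");
  if (parts.length !== 3) {
    return null;
  }
  var year = parseInt(parts[0], 10);
  var month = parseInt(parts[1], 10);
  var day = parseInt(parts[2], 10);
  // parseInt returns NaN if a part does not start with a digit.
  if (isNaN(year) || isNaN(month) || isNaN(day)) {
    return null;
  }
  return { year: year, month: month, day: day };
}

var parsed = parseDateParts("2009-06-09");
```

Note that this only checks that the parts are numbers, not that they form a valid date.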

Summary

To explicitly convert one value to a different type, use Boolean, String and Number. You might use parseInt instead for converting user input, to only accept decimal numbers.

All about types in Javascript – Automatic type conversion

Posted on May 25, 2009 by Matthias Reuter Posted in Uncategorized

This is the second part of a series “All about types” in Javascript.

  1. The basics – types and general rules of conversion
  2. Automatic type conversion
  3. Explicit type conversion
  4. Type detection

Automatic type conversion

You have seen it before:

var element = document.getElementById("someId");
if (element) {
  // do something
}

That’s an automatic type conversion. The if-statement expects a boolean value, and if the given expression does not return one, the result is converted. document.getElementById either returns an object or null. Null is converted to false, any object to true. That’s why constructions as the above work.

In general, every time an operator or a statement expects a value of a certain type but a different type is given, automatic type conversion occurs.

There are several statements that require a certain type and do type conversion when necessary: the if-statement, while and do-while, and the for-statement. They all require booleans, so other types are converted to boolean.
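
Which values end up as false in such a statement? The following sketch lists them. The function name convertsToTrue and the two arrays are only there for illustration.

```javascript
// Mirror what an if-statement does with its condition.
function convertsToTrue(value) {
  if (value) {
    return true;
  }
  return false;
}

// These values convert to false ...
var falsy = [false, 0, "", null, undefined, NaN];
// ... while these convert to true, including "0" and empty objects and arrays.
var truthy = [true, 1, "0", "false", {}, []];
```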

The most common operators are the mathematical operators, such as + and -, and the comparison operator == (and their comrades +=, -= etc. and !=). I will cover the uglier operators first.

Automatic type conversion when dealing with the + operator

The + operator is an ugly element of Javascript. While it’s also overloaded in other languages, it’s clear (and intuitive) what happens there. That’s different in Javascript. What’s the intuitive way to add a boolean to an object? Is there any way to define some semantics for adding a boolean to an object? I can’t think of one. Yet it can be done in Javascript.

When dealing with two numbers, the + operator adds these. When dealing with two strings, it concatenates.

3 + 5;              // 8
"hello " + "world"; // "hello world"
"3" + "5";          // "35"

When one operand is a string and the other is not, the other is converted to a string and the result is concatenated.

"the answer is " + 42; // "the answer is 42"
"this is " + true;     // "this is true"
var a;
"a is " + a;           // "a is undefined"

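In any other case (except for Dates) the operands are converted to numbers and the results are added. Objects other than Dates are first converted to a primitive value; if that turns out to be a string, the string rule above applies.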

1 + true;     // -> 1 + 1 -> 2
null + false; // -> 0 + 0 -> 0

Dates are converted to strings instead (with +, a Date prefers its toString over its valueOf), and thus the two operands are concatenated.

new Date() + 86400000; // that's not adding one day to date!
// that results in "Tue May 19 2009 10:46:30 GMT+020086400000" or something
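
To really add a day, get a number out of the Date first. A minimal sketch, assuming a helper I call addDays:

```javascript
var MS_PER_DAY = 86400000;

// Hypothetical helper: return a new Date that is `days` times 24 hours later.
// Across a daylight saving switch that is not the same wall-clock time.
function addDays(date, days) {
  // getTime() yields a number, so + performs addition here.
  return new Date(date.getTime() + days * MS_PER_DAY);
}

var start = new Date(2009, 4, 19, 10, 46, 30);
var concatenated = start + MS_PER_DAY;  // a string, as described above
var nextDay = addDays(start, 1);        // a Date, 24 hours later
```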

The comparison operator ==

The first book on javascript that I read was ppk on JavaScript. There I read: “[…] An empty String '' becomes false. All other values become true.” Of course I had to test this myself and tried to convert the string “0” to a boolean. How do you know if this converts to true? Compare it! (No, actually not. Don’t repeat my mistake. Read the next part of this series about how to explicitly convert types.) So I did this:

"0" == true;  // don't do this

To my surprise this evaluated to false. I even wrote an email to ppk to point out that error in his book, which I find quite embarrassing by now.

The reason behind that is the unexpected ruleset of type conversion using the comparison operator. In my case I expected it to convert the string “0” to a boolean, but it did not. Actually, the boolean was converted to a number. Yes, to a number. In the next step “0” was converted to a number as well. Therefore these were the steps taken:

"0" == true;  // first step: convert true to a number
"0" == 1;     // second step: convert "0" to a number
0 == 1;       // last step: compare
false;

That taught me a lot about javascript.

If you compare two values of the same type using the comparison operator no type conversion is done (although the results sometimes are surprising, I might cover this in a separate article).

So it comes down to comparing two values of different types. The first rule is: null and undefined are equal.

null == undefined; // true

The second rule is: when in doubt convert to number. If you compare a string and a number, convert the string to a number. If you compare a boolean to something else, convert the boolean to a number. On the other hand, if you compare an object to something else, convert it to a primitive by calling its valueOf method (and if that does not return a primitive call toString) and go on.

The Date object is different, of course. Its valueOf returns a number, yet in combination with the comparison operator a Date is converted with toString, which gives a string representation.

This table shows it all. Note that the comparison operator is commutative, that means x == y and y == x return the same value.

Rules of conversion using x == y

type of x        type of y           result
null             undefined           true
number           string              x == Number(y)
boolean          any                 Number(x) == y
object           string or number    primitive representation of x == y
any other case                       false
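
If you prefer code over tables, the rules can be sketched as a function. This is my own illustration, not how any engine implements it. The names typeOf, toPrimitive and looseEquals are made up, and newer types such as symbols are not handled.

```javascript
// typeof with two fixes: null gets its own name, functions count as objects.
function typeOf(value) {
  if (value === null) { return "null"; }
  var t = typeof value;
  return t === "function" ? "object" : t;
}

// Convert an object to a primitive: valueOf first, then toString.
// Dates try toString first.
function toPrimitive(obj) {
  var order = obj instanceof Date ? ["toString", "valueOf"] : ["valueOf", "toString"];
  for (var i = 0; i < order.length; i++) {
    var method = obj[order[i]];
    if (typeof method === "function") {
      var result = method.call(obj);
      if (typeOf(result) !== "object") { return result; }
    }
  }
  throw new TypeError("Cannot convert object to primitive value");
}

function looseEquals(x, y) {
  var tx = typeOf(x);
  var ty = typeOf(y);
  // Same type: no conversion at all.
  if (tx === ty) { return x === y; }
  // null and undefined are equal to each other and to nothing else.
  var xEmpty = tx === "null" || tx === "undefined";
  var yEmpty = ty === "null" || ty === "undefined";
  if (xEmpty || yEmpty) { return xEmpty && yEmpty; }
  // Booleans are converted to numbers first.
  if (tx === "boolean") { return looseEquals(Number(x), y); }
  if (ty === "boolean") { return looseEquals(x, Number(y)); }
  // Number versus string: convert the string.
  if (tx === "number" && ty === "string") { return x === Number(y); }
  if (tx === "string" && ty === "number") { return Number(x) === y; }
  // Objects are converted to a primitive and compared again.
  if (tx === "object") { return looseEquals(toPrimitive(x), y); }
  if (ty === "object") { return looseEquals(x, toPrimitive(y)); }
  return false;
}
```

With this, looseEquals("0", true) takes exactly the steps from my example above and ends at false.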


The -, *, / and % operators

That's easy. When dealing with these operators, the operands are converted to numbers.

3 / "3";    // 1
"3" - true; // 2
null * "3"; // 0
"42" % "5"; // 2

The relational operator <

You can either compare strings or numbers. If you try to compare other types, convert the operands to numbers. Even Dates comply with that. Then you have two numbers to compare, and that's easy.

true < 2;                   // 1 < 2, true
true < null;                // 1 < 0, false
"3" < 4;                    // 3 < 4, true
new Date() < 1234567890000; // false, if your clock is set correctly
({ valueOf : function () { return 3; } }) < 4; // 3 < 4, true
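
The difference between comparing strings and comparing numbers is easy to trip over, as this small sketch shows (the variable names are mine):

```javascript
// Both operands are strings: compared character by character.
var stringCompare = "10" < "9";   // "1" sorts before "9"
// One operand is a number: the string is converted to a number first.
var mixedCompare = "10" < 9;
// A non-numeric string becomes NaN, and every comparison with NaN is false.
var nanCompare = "abc" < 5;
var nanCompareReverse = "abc" >= 5;
```

Here stringCompare is true while mixedCompare is false, and both NaN comparisons are false.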

The logical operators && and ||

The logical operators in javascript work on any type, unlike in Java. And unlike in Java, they do not necessarily return a boolean value. Still, they test boolean values, and an operand that is not a boolean is converted to one for that test.

So technically that's a simple case. The implications, however, are immense. If you understand what happens you have risen to a higher level of javascript development. If you like you may call yourself a ninja now.

Though the logical operators implicitly convert the operands to boolean (not necessarily all of them, depending on the context), they do not return the boolean value but the operand itself.

var a = 0 || "4";
// a is now "4"

This comes in handy when supplying default values. Assume a function that returns the day of a year for a given date. If no date is supplied, the current date is taken:

function getDayOfYear(date) {
  date = date || new Date();
  var first = new Date(date.getFullYear(), 0, 1);
  return Math.floor((date - first) / 86400000) + 1;
}
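
One thing to keep in mind: the default kicks in for every value that converts to false, not only for a missing argument. A sketch with two made-up functions, repeatWithOr and repeatWithCheck, shows the difference:

```javascript
// Default via ||: any falsy argument (0, "", null, undefined) triggers the default.
function repeatWithOr(text, times) {
  times = times || 2;
  return new Array(times + 1).join(text);
}

// Default only when the argument was really left out.
function repeatWithCheck(text, times) {
  if (times === undefined) {
    times = 2;
  }
  return new Array(times + 1).join(text);
}

var orResult = repeatWithOr("ab", 0);       // 0 is replaced by the default
var checkResult = repeatWithCheck("ab", 0); // 0 is kept
```

For a Date parameter as above this does not matter, since every object converts to true.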

In DOM scripting the logical or operator helps avoid browser incompatibilities:

function doSomethingOnClick (event) {
  event = event || window.event;
  var target = event.target || event.srcElement;
}

In line 2 we check whether the parameter event has been set (W3C event model) or not (Microsoft event model). In line 3 we get the element the action occurred on according to the W3C model or the Microsoft model respectively.

You can use the logical and operator to do null checks before accessing object properties:

function setBackground (element, color) {
  if (element && element.style) {
    element.style.backgroundColor = color;
  }
}

Line 2 is a null check for both element and element.style.
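
Since && returns an operand and not a boolean, the same chain also works for reading a nested property. A sketch, with getBackground being a name I made up and plain objects standing in for DOM elements:

```javascript
// Hypothetical helper: read element.style.backgroundColor without throwing.
function getBackground(element) {
  // && returns the first operand that converts to false,
  // or the last operand if all of them convert to true.
  return element && element.style && element.style.backgroundColor;
}

var withStyle = getBackground({ style: { backgroundColor: "red" } });
var withoutStyle = getBackground({});
var withoutElement = getBackground(null);
```

The result is "red" in the first case, undefined in the second and null in the third, so the caller sees which operand stopped the chain.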

Summary

Automatic type conversion happens in many cases. In most cases it is somewhat logical, in others you either need to know exactly what you're doing or you need to do an explicit type conversion. How to do this is covered in the upcoming third part of this series.


Copyright

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.
© united-coders
