Hoffbauer

Ideas, Finds from the Web, Thoughts

07 November 2005

 

Geolocation by IP Address...

Determining geographic location from Internet IP addresses enables localization services and brings together user communities without the need for GPS receivers or complicated configuration switching.

The Internet has become a collection of resources meant to appeal to a large general audience. Although this multitude of information has been a great boon, it also has diluted the importance of geographically localized information. Offering Internet users the ability to find information based on geographic location can decrease search times and increase the visibility of local establishments. Similarly, user communities and chat rooms can be enhanced by knowing the locations (and therefore the local times, weather conditions and news events) of their members as they roam the globe. It is possible to provide such services in applications and Web sites without the need for users to carry GPS receivers or even to know where they themselves are.

Geolocation by IP address is the technique of determining a user's geographic latitude, longitude and, by inference, city, region and nation by comparing the user's public Internet IP address with the known locations of other electronically neighboring servers and routers. This article presents some of the reasons for and benefits of using geolocation through IP address, as well as several techniques for applying this technology to an application, Web site or user community.

Why Geolocation?

The benefits of geolocation may sound complex, but a simple example may help illustrate the possibilities. Consider a traveling businessman currently on the road to San Francisco. After checking into his hotel, he pulls out his laptop and hops onto the wireless Internet access point provided by the hotel. He opens his chat program as well as a Web browser. His friends and family see from his chat profile that he currently is near Golden Gate Park. Consequently, they can determine his local time. Furthermore, in his Web browser, the businessman can do a localized search to find nearby restaurants and theaters.

Without having to know the address of the hotel he's staying in, the chat program and Web pages can determine his location based on the Internet address through which he is connecting. The following week, when he has returned to his home in Florida, he uses his laptop to log into a chat program, and his chat profile correctly places him in his home city. There is no need to change computer configurations, remember addresses or even be aware, as the user, that you are benefiting from geolocation services.

Possible applications for geolocation by IP address exist for Weblogs, chat programs, user communities, forums, distributed computing environments, security, urban mapping and network robustness. We encourage you to find out what applications and Web sites currently employ geolocation or could be enhanced by adding support.

Although several methods of geographically locating an individual currently exist, each has cost and other drawbacks that make it prohibitive in computing environments. GPS is limited by line-of-sight to the constellation of satellites in Earth orbit, which severely limits locating systems in cities, due to high buildings, and indoors, due to complete overhead blockage. Several projects have been started to install sensors or to use broadcast television signals (see Resources) to provide urban and indoor geolocation. Unfortunately, these solutions require a great deal of money for the installation of new infrastructure and devices, and these services are not widely supported yet.

By contrast, these environments already are witnessing a growing trend of installing wireless access points (APs). Airports, cafes, offices and city neighborhoods all have begun installing wireless APs to provide Internet access to wireless devices. Using this available and symbiotic infrastructure, geolocation by IP address can be implemented immediately.

Geolocation Standards and Services

As discussed below, several RFC proposals have been made by the Internet Engineering Task Force (IETF) that aim to provide geolocation resources and infrastructure. However, these standards have met with little support from users and administrators. To date, there has not been much interest in providing user location tracking and automatic localization services. Several companies now offer pay-per-use services for determining location by IP. These services can be expensive, however, and don't necessarily offer the kind of functionality a programmer may want when designing his or her Web site or application.

Several years ago, CAIDA, the Cooperative Association for Internet Data Analysis, began a geolocation by IP address effort called NetGeo. This system was a publicly accessible database of geographically located IP addresses. Through the use of many complex rules, the NetGeo database slowly was filled in and corrected for the location of IP addresses. The project since has been stopped, and the technology was licensed to new partners. The database still is available, however, and although it is several years old, it provides a good resource for determining rough locations.

To query the NetGeo database, an HTTP request is made with the query IP address, like this:

--
$ http://netgeo.caida.org/perl/netgeo.cgi?target=192.168.0.1
VERSION=1.0
TARGET: 192.168.0.1
NAME: IANA-CBLK1
NUMBER: 192.168.0.0 - 192.168.255.255
CITY: MARINA DEL REY
STATE: CALIFORNIA
COUNTRY: US
LAT: 33.98
LONG: -118.45
LAT_LONG_GRAN: City
LAST_UPDATED: 16-May-2001
NIC: ARIN
LOOKUP_TYPE: Block Allocation
RATING:
DOMAIN_GUESS: iana.org
STATUS: OK
--

As you can see, the NetGeo response includes the city, state, country, latitude and longitude of the IP address in question. Furthermore, the granularity (LAT_LONG_GRAN) also is estimated to give some idea about the accuracy of the location. The LAST_UPDATED field gives a further hint about accuracy: the older the update, the more likely it is that the location has changed. This is true especially for IP addresses assigned to residential customers, as the companies holding these addresses are in constant flux.

In order to make this database useful to an application or Web site, we need to be able to make the request through some programming interface. Several existing packages assist in retrieving information from the NetGeo database. The PEAR system has a PHP package (see Resources), and a Perl module, CAIDA::NetGeo::Client, is available. However, it is a relatively straightforward task to make a request in whatever language you are using for your application or service. For example, a function in PHP for getting and parsing the NetGeo response looks like this:

--
function getLocationCaidaNetGeo($ip)
{
    $NetGeoURL = "http://netgeo.caida.org/perl/netgeo.cgi?target=" . urlencode($ip);

    // Read the whole response; note that the fopen mode must be the quoted string 'r'.
    $NetGeoFP = fopen($NetGeoURL, 'r') or die("Could not open $NetGeoURL");
    $NetGeoHTML = stream_get_contents($NetGeoFP);
    fclose($NetGeoFP);

    // Capture only the numeric value that follows each field name.
    $location = array();
    preg_match("/LAT:\s*(-?[\d.]+)/i", $NetGeoHTML, $temp) or die("Could not find element LAT");
    $location[0] = $temp[1];
    preg_match("/LONG:\s*(-?[\d.]+)/i", $NetGeoHTML, $temp) or die("Could not find element LONG");
    $location[1] = $temp[1];

    return $location;
}
--
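The other fields of the response, such as LAT_LONG_GRAN and LAST_UPDATED, can be picked up the same way. The following is a minimal sketch in Python rather than PHP. It assumes the response arrives as plain "KEY: value" lines exactly like the sample shown earlier (any HTML markup would have to be stripped first), and the function names parse_netgeo and record_age_days are hypothetical, not part of any NetGeo package.

```python
from datetime import date

# Sample text copied from the NetGeo response shown earlier in the article.
SAMPLE_RESPONSE = """\
VERSION=1.0
TARGET: 192.168.0.1
NAME: IANA-CBLK1
NUMBER: 192.168.0.0 - 192.168.255.255
CITY: MARINA DEL REY
STATE: CALIFORNIA
COUNTRY: US
LAT: 33.98
LONG: -118.45
LAT_LONG_GRAN: City
LAST_UPDATED: 16-May-2001
NIC: ARIN
LOOKUP_TYPE: Block Allocation
RATING:
DOMAIN_GUESS: iana.org
STATUS: OK
"""

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def parse_netgeo(text):
    """Return the 'KEY: value' lines of a NetGeo-style response as a dict."""
    record = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # lines without a colon (such as VERSION=1.0) are skipped
            record[key.strip()] = value.strip()
    return record


def record_age_days(record, today):
    """Days between the record's LAST_UPDATED date (DD-Mon-YYYY) and today."""
    day, month, year = record["LAST_UPDATED"].split("-")
    updated = date(int(year), MONTHS.index(month) + 1, int(day))
    return (today - updated).days


record = parse_netgeo(SAMPLE_RESPONSE)
latitude, longitude = float(record["LAT"]), float(record["LONG"])
print(latitude, longitude, record["LAT_LONG_GRAN"])
print(record_age_days(record, date(2005, 11, 7)), "days since last update")
```

An application could use the age and granularity together, for example by trusting a city-level record only if it was updated recently.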

Using DNS to Your Advantage

As previously mentioned, the NetGeo database slowly is becoming less accurate as IP address blocks change hands in company close-outs and absorptions. Several other tools are available for determining location, however. A description of the NetGeo infrastructure itself (see Resources) presents some of the methods it employed for mapping IP addresses and can be a source of guidance for future projects.

One of the most useful geolocation resources is DNS LOC information, but it is difficult to enforce across the Internet infrastructure. RFC 1876 is the standard that outlines "A Means for Expressing Location Information in the Domain Name System." Specifically, this is done by publishing the location of a server in a LOC record in its DNS entry. Several popular servers have employed this standard, but not enough to be directly useful as of yet.

To check the LOC DNS information of a server, you need to get the LOC type of the host:

--
$ host -t LOC yahoo.com
yahoo.com LOC 37 23 30.900 N 121 59 19.000 W 7.00m 100m 100m 2m
--

This parses out to 37 degrees 23' 30.900'' North Latitude by 121 degrees 59' 19.000'' West Longitude at 7 meters in altitude, with an approximate size of 100 meters at 100 meters horizontal precision and 2 meters vertical precision. There are several benefits to servers that offer their geographic location in this way. First, if you are connecting from a server that shows its DNS LOC information, determining your geolocation is simple, and applications may use this information without further work, although some verification may be useful. Second, if you are connecting on your second or third bounce through a server that has DNS LOC information, it may be possible to make an estimate of your location based on traffic and ping times. It should be obvious, however, that such estimates are far less accurate.
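For use in an application, the degrees, minutes and seconds of a LOC record usually need to be converted to signed decimal degrees. Here is a minimal sketch in Python, assuming the record data is fully written out as in the host output above (RFC 1876 lets some fields be omitted, which this sketch does not handle). The function names are hypothetical.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus N, S, E or W to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and West longitudes are negative by convention.
    return -value if hemisphere in ("S", "W") else value


def parse_loc(rdata):
    """Parse LOC record data: latitude, longitude, altitude, size, precisions."""
    f = rdata.split()
    if len(f) != 12:
        raise ValueError("expected a fully written-out LOC record")
    latitude = dms_to_decimal(float(f[0]), float(f[1]), float(f[2]), f[3])
    longitude = dms_to_decimal(float(f[4]), float(f[5]), float(f[6]), f[7])
    # The last four fields are in meters and carry a trailing 'm'.
    altitude, size, horizontal, vertical = (float(x.rstrip("m")) for x in f[8:12])
    return {
        "latitude": latitude,
        "longitude": longitude,
        "altitude_m": altitude,
        "size_m": size,
        "horizontal_precision_m": horizontal,
        "vertical_precision_m": vertical,
    }


# Record data copied from the host output above (everything after "LOC").
loc = parse_loc("37 23 30.900 N 121 59 19.000 W 7.00m 100m 100m 2m")
print(round(loc["latitude"], 4), round(loc["longitude"], 4))
```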

It also is possible to put the DNS LOC information for your Web site in its registration (see Resources). If more servers come to use LOC information, geolocation accuracy will be much easier to attain.

Sidebar: host

host is a DNS lookup utility that allows users to find out various pieces of information about a host. The simplest use is doing hostname to IP address lookups and the reverse. For a reverse lookup, the address is given in dotted-decimal IPv4 notation, and the canonical name of the server at that address is returned. The type flag, -t, can be used to obtain specific information from the host record from the name server.

Where There's a Name, There's a Way

Many users hopping onto the Internet probably aren't coming from a major server. In fact, most users don't have a static IP address. Dial-up, cable modem and cell phone connections are assigned a dynamic IP address that may change multiple times in one day or not at all for several weeks. Therefore, it becomes difficult to tie these dynamic addresses to a single location.

To our rescue, these service providers typically provide an internal naming scheme for assigning IP addresses and associating names with these addresses. Typically, the canonical name of an IP address contains the country-code top-level domain (ccTLD) as a suffix. CN is China, FR is France, RO is Romania and so on. Furthermore, the name even may contain the city or region in which the IP address is located. Often, however, this information is shortened to some name that requires a heuristic to determine. For example, in your service or application, a user may appear to be coming from d14-69-1-64.try.wideopenwest.com. A whois at this address reveals it is a WideOpenWest account from Michigan. Using some logic, it is possible to deduce that this user is connecting through a server located in Troy, MI, hence the .try. in the canonical name.

Some projects have been started to decipher these addresses (see Resources), and you also can get all of the country codes and associated cities and regions of a country from the IANA Root-Zone Whois Information or the US National Geospatial-Intelligence Agency, which hosts the GEOnet Names Server (GNS). The GNS has freely available data files on almost all world countries, regions, states and cities, including their sizes, geographic locations and abbreviations, as well as other information.

Information such as that presented on the GNS also can be used to provide users with utilities and services specific to their geographical locations. For example, it is possible to determine a user's local currency, time zone and language. Time zone is especially useful for members of a community or chat group to determine when another friend may be available and on-line.
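When only a longitude is known, a very rough time zone guess is possible using the nominal rule of one hour per 15 degrees of longitude. The Python sketch below is only an approximation under that assumption: real time zones follow political boundaries and daylight saving rules, so a production service would look the location up in a proper time zone database instead. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def nominal_utc_offset_hours(longitude):
    """Rough UTC offset in whole hours: one hour per 15 degrees of longitude."""
    return round(longitude / 15)


def approximate_local_time(utc_time, longitude):
    """Shift a timezone-aware UTC time to the nominal zone of the longitude."""
    offset = timedelta(hours=nominal_utc_offset_hours(longitude))
    return utc_time.astimezone(timezone(offset))


# Longitude taken from the NetGeo sample response earlier in the article.
utc_time = datetime(2005, 11, 7, 20, 0, tzinfo=timezone.utc)
local_time = approximate_local_time(utc_time, -118.45)
print(local_time.isoformat())
```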

Where Are You Located?

Now that we've explained some of the techniques that can be used in geolocating Internet users by their IP addresses, we offer you a chance to try it out. Point your Web browser of choice here, and see how accurate or inaccurate the current results are. Please leave comments below about the accuracy of your results as well as any ideas you may have.

 