Mapping Where the Data Flows
By Martin Dodge
m.dodge@ucl.ac.uk
When I type the Internet Society's Web address into my browser,
the html and graphics are seamlessly downloaded and the page displayed
within a couple of seconds. But how does this data actually get
from the ISOC server to my PC in a basement office in central
London? How does it flow through the Internet to reach me? You
can answer this with traceroute, a useful tool that allows you
to lift the lid on the Internet and get a packet's-eye view of
the network. If you're like me, intrigued as to how the Internet
works beyond your browser and the telephone jack in the wall, then
traceroute can be a fun tool to use to explore and map the Internet.1
A traceroute utility maps the path that data packets take between
two points on the Internet, showing all of the intermediate nodes
traversed, along with an indication of the speed of travel. Traceroute
was invented in 1988 by Van Jacobson at Lawrence Berkeley National
Laboratory in the U.S. Today a traceroute utility often comes
as part of the operating system. Windows, for example, has a small
utility called tracert, which you run by typing, at the MS-DOS
prompt, tracert <Internet host name, e.g., www.yahoo.com>.
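For readers who would rather script the trace than type it, here is a minimal Python sketch. It assumes the utility is called tracert on Windows, as described above, and traceroute on other systems. The function names build_trace_command and run_trace are illustrative and do not come from any particular tool.

```python
import platform
import subprocess


def build_trace_command(host, system=None):
    """Return the command line for the local traceroute utility.

    Assumption: the program is named "tracert" on Windows and
    "traceroute" on other systems.
    """
    system = system or platform.system()
    program = "tracert" if system == "Windows" else "traceroute"
    return [program, host]


def run_trace(host):
    """Run the trace and return its raw text output (needs network access)."""
    result = subprocess.run(
        build_trace_command(host),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling run_trace("www.isoc.org") should return the same listing you would see at the prompt, provided the utility is installed and the network allows the probes through.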
To illustrate how traceroute can map the Internet, I used it to
analyze the path from my PC in London to www.isoc.org. The program
works away for a few seconds as it dynamically explores the pathway
the data takes, moving one node at a time. The end result is the
following rather cryptic-looking output:

This might look meaningless to those unfamiliar with traceroute,
but it is in fact a kind of one-dimensional map of how the data
flows, with each node listed on a separate line. The map gives
valuable information on the real-time routing of data packets
through the Internet between London and Reston, Virginia, in the
U.S., the apparent location of the ISOC Web server. Each row of
the chart contains a node number followed by three time measurements
in milliseconds, such as 20ms 30ms <10ms. These are three separate
measurements of the time it took the data packet to travel from
the origin computer to that particular node and back. This is
called the round-trip time (RTT) and gives an indication of the
speed of each link. Finally, each row identifies a node as a domain
name and numeric IP address.
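The row layout described above (node number, three round-trip times, then a domain name and IP address) is regular enough to read with a short script. The following Python sketch assumes the Windows tracert layout, with the IP address in square brackets after the name. Other traceroute versions arrange the columns differently. The sample line is made up for illustration and uses an address from a range reserved for documentation.

```python
import re
from typing import List, NamedTuple, Optional


class Hop(NamedTuple):
    number: int
    rtts_ms: List[Optional[float]]  # None marks a probe that timed out ("*")
    host: Optional[str]
    ip: Optional[str]


# Node number, three timings (each "<n ms", "n ms" or "*"), then the rest.
ROW = re.compile(r"^\s*(\d+)\s+((?:(?:<?\d+\s*ms|\*)\s+){3})(.*)$")
TIMING = re.compile(r"(<?\d+)\s*ms|\*")
NAME_AND_IP = re.compile(r"^(\S+)\s+\[([\d.]+)\]$")
IP_ONLY = re.compile(r"^[\d.]+$")


def parse_hop(line):
    """Parse one row of tracert-style output, or return None if it does not fit."""
    match = ROW.match(line)
    if match is None:
        return None
    # "<10" is treated as an upper bound and stored as 10.0.
    rtts = [float(t.lstrip("<")) if t else None
            for t in TIMING.findall(match.group(2))]
    target = match.group(3).strip()
    host = ip = None
    named = NAME_AND_IP.match(target)
    if named:
        host, ip = named.group(1), named.group(2)
    elif IP_ONLY.match(target):
        ip = target
    return Hop(int(match.group(1)), rtts, host, ip)


# Hypothetical row: the timings and host name echo the article, the IP is invented.
sample = "  1    20 ms    30 ms   <10 ms  cisco-2.bart.ucl.ac.uk [192.0.2.1]"
hop = parse_hop(sample)
```

Here hop.number is 1, hop.rtts_ms holds the three timings, and hop.host and hop.ip hold the node's name and address.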
The traceroute output is comparable to a railway timetable that
charts (1) how trains travel through the network, (2) which stations
they visit, and (3) when they arrive. The traceroute timetable
above shows that data travelling from London to ISOC had to pass
through 16 intermediate stations, or network routers, to reach
the end of the line. To decode the traceroute chart further, read
it line by line to see each section of the route through the network.
In our trace we begin at node one, which is the departmental router
for my office, known by the somewhat cryptic domain name cisco-2.bart.ucl.ac.uk. The data packets move rapidly onward to the next node, which
has only a numeric address and is likely to be an anonymous but
important link somewhere in a server room in the university. At
node number three the data leaves my university's internal network
and joins the London metropolitan-area network (http://www.lonman.net.uk/Images/map99.gif), which provides a fast backbone for universities and colleges
in London. From there it moves on to the JANET backbone, the U.K.'s
academic and research network (http://www.ja.net/), at a gateway machine called south-east-gw.ja.net. Node five is the gateway router to the transatlantic link for
the JANET network, which connects to ny-pop.janet, its point of presence (POP) in New York. At this point the data
packets have crossed the Atlantic, and consequently, you can see
a marked jump in the round-trip travel times from under 10 milliseconds
to 70 milliseconds, caused largely by the 3,000-mile distance between
London and New York. The traffic then flows into the Teleglobe
backbone network (http://www.teleglobe.com/) in New York at nodes seven and eight before passing to AlterNet, part
of UUNET's backbone empire (http://www.alter.net/lang.en/network/), also in New York.
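A rough calculation shows why distance alone accounts for much of that jump. The Python sketch below estimates the smallest possible round-trip time over a given distance. It assumes that signals in optical fibre travel at about two-thirds of the speed of light in a vacuum and that the cable follows the straight-line distance quoted above. Neither assumption comes from the trace itself, and real cable routes are longer.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in a vacuum
FIBRE_FRACTION = 2 / 3              # assumption: typical slowdown in optical fibre
KM_PER_MILE = 1.609344


def min_rtt_ms(distance_miles):
    """Lower bound on round-trip time over the given distance, in milliseconds."""
    distance_km = distance_miles * KM_PER_MILE
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_S * FIBRE_FRACTION)
    return 2 * one_way_seconds * 1000


# London to New York, using the article's 3,000-mile figure: roughly 48 ms.
atlantic_rtt = min_rtt_ms(3000)
print(f"{atlantic_rtt:.0f} ms")
```

Under these assumptions, propagation delay makes up a large share of the observed 70 milliseconds. The remainder would come from the routers along the way and from the cable's actual, longer path.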
Notice the strange, long domain names of these routers at the
core of the Internet. These names often hint at the city where
the node is located. They could contain the full name of the city, such
as newyork, or just an abbreviation, such as nyc, which can require some educated guesswork to decode. Fortunately
for traceroute explorers, many of the large backbone operators
use similar naming conventions for their network infrastructures.
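This kind of educated guesswork can be sketched in a few lines of Python. The lookup table holds only the city names and abbreviations that appear in this article, with the locations the article itself treats as probable. A real tool would need a much larger table. The example.net host names are hypothetical and are not taken from the trace.

```python
import re

# City hints mentioned in this article; the decoded locations are guesses.
CITY_HINTS = {
    "newyork": "New York",
    "nyc": "New York",
    "sfo": "San Francisco",
    "scl": "Santa Clara",
    "tco": "Tysons Corner",
}


def guess_city(hostname):
    """Return a likely city for a router host name, or None if no hint matches."""
    # Split on anything that is not a letter, so "nyc3" yields the token "nyc".
    for token in re.split(r"[^a-z]+", hostname.lower()):
        if token in CITY_HINTS:
            return CITY_HINTS[token]
    return None


print(guess_city("core1-nyc3.example.net"))   # hypothetical router name
print(guess_city("192.0.2.7"))                # numeric address: no hint available
```

The first call finds the nyc token and reports New York. The second returns None, as happens with the anonymous nodes in a trace that have only a numeric address.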
From AlterNet's network in New York, packets flow through two
more nodes before dropping down to Washington, D.C., at node 12.
Then it is on to two nodes in TCO, according to their domain name,
which is most likely to be Tysons Corner, Virginia. We are nearing
our target. At node 15 we have reached ISOC's network, and node
16 is the end of the line: the Internet Society's Web server. The
ISOC home page took 16 hops, across four different networks, to
get to my browser, quite a feat of routing and cooperation, but
all in a day's work for the Internet. This happens, unseen, for
the millions of Web surfers who need not worry about where the
data flows.
Triangulating the Internet with Web-Based Traceroutes
Conventional traceroute utilities are limited in one important
respect: the origin point of the exploration is fixed to the location
of the PC running the trace. To overcome this limitation, you
can use Web-based traceroutes that allow you to run a trace from
different starting points. These allow you to explore the Internet's
topology from multiple locations, a kind of virtual triangulation.
I ran a traceroute from Canberra, Australia, to ISOC by using
a Web-based traceroute publicly provided by Telstra, a major Australian
telecom company and Internet backbone operator: http://www.telstra.com.au/. The output trace timetable is as follows:

Note that the output from this version of traceroute is in a slightly
different format from the previous version.
The traceroute utility is installed on a Telstra server that is
quite likely located in Canberra, given its domain name, Canberra.telstra.net, so
this is where our data packets begin their journey to ISOC. The
next two nodes in the trace are also within Canberra, according
to their domain names. At node four the data moves a couple of
hundred miles from Canberra to Sydney.
The big trans-Pacific hop occurs at node six. There is certainly
a marked jump in the RTT at this point in our journey, caused
by the 7,500-mile distance across the Pacific Ocean. There is
no domain name for node six to give us a clue to its location,
but it is likely to be in California.
At node eight, the data joins the AboveNet network: http://www.above.net/network/network.html. The SFO domain name means it is probably in San Francisco. After
another node in SFO, the data moves to a node in SCL, which is
probably Santa Clara, California. The next major element in the
journey is the hop across the continental U.S. from California
to the Washington, D.C., region, at node 11. This long distance
is matched by another significant increase in the RTT. Nodes 11-13
have no domains to tell us which networks they are on, but at
node 14 the data has arrived at the ISOC network and Web server.
To run a trace from Canberra yourself, go to http://www.telstra.net/cgi-bin/trace. There are several hundred freely available Web traceroute servers
in many different countries and cities across the world. Thomas
Kernen maintains a good list of them at http://www.traceroute.org/.
Add a Bit of Geography
An obvious refinement of the basic traceroute is to show the route
the data takes visually, on a map. This is known as a geographical
traceroute. A number of these applications (see sidebox) attempt,
with varying degrees of success, to map the physical location
of Internet nodes traversed in a trace. I used two of the best, NeoTrace
and VisualRoute, to run the first trace example from London to
ISOC to see how they performed when actually mapping the route.
Both of them are easy-to-use, affordable application utilities.
NeoTrace
NeoTrace is a geographical traceroute utility developed by NeoWorx.
It provides four different views of the trace in tabbed panels:
a geographic map, a nodal graph layout, a conventional listing,
and a line graph of RTT performance. A trial of NeoTrace for Windows
95/98/NT can be downloaded from http://www.neotrace.com for free; the full program costs US$ 29.95. The figure shows
the map view result of our trace; it has done a good job, successfully
locating and mapping 8 of 16 nodes, including the target in Reston.

VisualRoute
The second geographic traceroute application is Datametrics Systems
Corporation's VisualRoute. The application interface includes
both a zoomable world map and a detailed listing of the trace.
A trial version of VisualRoute can be downloaded for free from
http://www.visualroute.com, and the full program costs US$ 29.95; it is available for Windows
95/98/NT, Linux and Solaris. VisualRoute also provides a Web server
version, with examples currently located in England; the Netherlands;
Canada; Fremont, California; and Fairfax, Virginia (http://www.visualroute.com/server.html). In our trace to ISOC, VisualRoute found and plotted the location
of six nodes with confidence and made educated guesses at five
others, including the target. The results are shown in the figure
below.

Although geographic traceroutes are useful, their mapping capabilities
currently are limited by the serious difficulty in mapping an
Internet node to an actual latitude and longitude. There is no
automatic way to match up these virtual and real-world addresses.
This is true even in the U.S., where one can at least match backbone
nodes to particular cities somewhat reliably. Consequently, geographic
traceroutes use a variety of heuristics to try to resolve a network
node to a geographic location with various levels of accuracy.2 This is a difficult problem to crack, and traceroutes largely
depend on looking up Internet addresses in static databases of
latitude and longitude. These databases, however, are only partial
and cannot keep pace with the Internet's constant growth and change.
They seem especially weak outside the U.S.
NeoTrace and VisualRoute offer a partial solution, allowing you
to add your own nodes and locations to their databases, along
with more detailed maps. A more effective solution is to add geographic
location information to domain names, as set out in a DNS-LOC
proposal that can be dynamically queried, but this has not been
implemented widely.3 An alternative might be NetGeo, developed by the Cooperative
Association for Internet Data Analysis (CAIDA), a service for
mapping network entities, such as IP addresses, domain names, and
ASes, to geographical locations.4
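To give a feel for the DNS-LOC approach mentioned above, the following Python sketch converts the text form of a location record into decimal latitude and longitude. It assumes the layout defined in RFC 1876: degrees, with optional minutes and seconds, followed by a hemisphere letter, first for latitude and then for longitude, and then an altitude in metres. Any fields after the altitude are ignored. The sample record is invented for illustration and does not describe any real host.

```python
def parse_loc(record):
    """Convert LOC record text to (latitude, longitude, altitude_m)."""
    tokens = record.split()

    def read_angle(position, hemispheres):
        # Collect one to three numbers (degrees, minutes, seconds)
        # up to the hemisphere letter that ends the angle.
        parts = []
        while position < len(tokens) and tokens[position].upper() not in hemispheres:
            parts.append(float(tokens[position]))
            position += 1
        if position >= len(tokens) or not 1 <= len(parts) <= 3:
            raise ValueError("malformed angle in LOC record")
        parts += [0.0] * (3 - len(parts))
        degrees, minutes, seconds = parts
        value = degrees + minutes / 60 + seconds / 3600
        if tokens[position].upper() in ("S", "W"):
            value = -value  # south and west are negative by convention
        return value, position + 1

    latitude, position = read_angle(0, ("N", "S"))
    longitude, position = read_angle(position, ("E", "W"))
    if position >= len(tokens):
        raise ValueError("LOC record is missing the altitude")
    altitude_m = float(tokens[position].rstrip("m"))
    return latitude, longitude, altitude_m


# Invented example: a point in central London, 25 metres above the reference level.
latitude, longitude, altitude_m = parse_loc("51 31 30 N 0 8 0 W 25m")
```

A geographic traceroute could plot such coordinates directly, without consulting a static database, but only for hosts whose administrators have published the record.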
Conclusions
Why is traceroute useful? First, traceroute is an important Internet
debugging tool for those involved in keeping networks running.
It can help identify routing problems quickly and simply. It can
also be useful in tracking down the source of spam e-mail,5 as well as in trying to find a Web site's true location before
giving it one's credit card details. Many Web sites using country-level
domains are not actually hosted in the nation they indicate.
On another level, traceroute can help satisfy those who are curious
to know how their computers connect to the Internet and how they
can access information from all around the world as if it were
just next door. Traceroute reveals the hidden complexity of data's
path to a given destination, sometimes across 10 or 20 nodes or
more, perhaps owned and operated by competing companies. Looking
at what is happening in real time on the Internet always gives
me a small sense of wonder that this system works so well, enabling
tens of millions of people to communicate daily. So if you have
a little time, go and trace the route to your favorite Web site
and uncover the hidden complexity of the Internet that lies beyond
the browser window. You may be surprised by where the data flows.
References
1 The following articles provide good background information on
using traceroute to explore the Internet:
J. Rickard, "Mapping the Internet with Traceroute," Boardwatch magazine, December 1996.
http://boardwatch.internet.com/mag/96/dec/bwm38.html.
J. Carl, "Nailing Down Your Backbone: The Imprecise Art of Tracerouting,"
Boardwatch ISP Directory, Summer 1999.
http://boardwatch.internet.com/isp/summer99/tracerouting.html.
S. Dumett, "Tracing Your Route Through The Net," PreText magazine, March 1998.
http://www.pretext.com/mar98/columns/posttext.htm.
Unix Man page for Traceroute. http://www.zytek.com/traceroute.man.html.
2 For a nice summary of various means of determining an Internet
host's geographic location, see "Finding a Host's Geographical
Location," by Uri Raz. http://www.private.org.il/IP2geo.html.
3 "Geo-enabling the Domain Name System." http://www.ckdhr.com/dns-loc/.
4 "NetGeo: the Internet Geographic Database." CAIDA. http://www.caida.org/Tools/NetGeo/.
5 B. Mattocks, "Reading Email Headers." http://www.blighty.com/spam/bill.html.
About the Author: Martin Dodge