Network Performance Open Source Toolkit
Using Netperf, tcptrace, NISTnet, and SSFNet
By Richard Blum
John Wiley & Sons
Copyright © 2003
All rights reserved.
Defining Network Performance
Before you dive into detailed discussions of network performance tools, it is a
good idea to first understand what network performance is, and how it can be
measured. This chapter defines network performance, describes the elements
involved in measuring it, and shows techniques many network performance
tools use to measure it.
The words that network administrators hate to hear are "The network seems
slow today." What exactly is a slow network, and how can you tell? Who determines
when the network is slow, and how do they do it? There are usually more
questions than answers when you're dealing with network performance in a
production network environment.
It would be great if there were standard answers to these questions, along
with a standard way to solve slow network performance. The open source network
performance tools presented in this book can help the network administrator
determine the status of the network, and identify the areas of the network
that could be improved to increase performance. Often, network bottlenecks
can be found, and simply reallocating the resources on a network can
greatly improve performance, without the addition of expensive new network
equipment.
Knowing the elements of network performance will help you better understand
how the network performance tools work, and how to interpret the vast
amount of information the tools provide. The first section of this chapter
describes the elements involved in determining network performance.
The Elements of Network Performance
Much work has been devoted to the attempt to define network performance
exactly. It is not the intention of this book to bore you with numerous equations
that describe theoretical network philosophy about how packets traverse
networks. Network performance is a complex issue, with lots of independent
variables that affect how clients access servers across a network. However, most
of the elements involved in the performance of networks can be boiled down to
a few simple network principles that can be measured, monitored, and controlled
by the network administrator with simple, often free, software.
Most network performance tools use a combination of five separate elements
to measure network performance:
* Availability
* Response time
* Network utilization
* Network throughput
* Network bandwidth capacity
This section describes each of these elements, and explains how network
performance tools use each element to measure network performance.
The first step in measuring network performance is to determine if the network
is even working. If traffic cannot traverse the network, you have bigger
problems than just network performance issues. The simplest test for network
availability is the ping program. By attempting to ping remote servers from
a client device on the network, you can easily determine the state of your
network.
Just about all Unix implementations include the ping program to query
remote hosts for availability. The ping program sends an Internet Control Message
Protocol (ICMP) echo request packet to the destination host. When the
echo request packet is received, the remote host immediately returns an echo
reply packet to the sending device.
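As a rough illustration of what the echo request described above looks like on the wire, the following Python sketch assembles one in memory. It assumes the standard ICMP echo layout (type 8, code 0, a 16-bit checksum, an identifier, and a sequence number); the function names and the identifier value are made up for this example, and nothing is actually sent on the network.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type for echo request (echo reply is type 0)


def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement Internet checksum of data."""
    if len(data) % 2:
        data += b"\x00"  # pad to an even number of bytes
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes) -> bytes:
    """Build an ICMP echo request: 8-byte header followed by the payload."""
    # Header fields: type, code, checksum (zero for now), identifier, sequence.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum,
                         identifier, sequence)
    return header + payload


# Hypothetical identifier; 56 zero bytes stand in for ping's default data.
packet = build_echo_request(identifier=0x1234, sequence=0, payload=bytes(56))
print(len(packet))  # 64: 8-byte header plus 56 data bytes
```

Running the checksum over the finished packet yields zero, which is the check a receiving host can apply before answering with an echo reply.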
While most network administrators know what the ping program is, few
know that there are lots of fancy options that can be used to perform advanced
testing using the ping program. The format of the ping command is:
ping [-dfnqrvR] [-c count] [-i wait] [-l preload] [-p pattern] [-s
You can use different combinations of options and parameters to create the
ping test that best suits your network environment. Often, just using the
default options and parameters provides enough information about a network
link to satisfy availability questions.
Receiving an echo reply packet from the remote host means that there is an
available network path between the client and server devices. If no echo reply
packet is received, there is a problem with either a network device or a link
along the path (assuming the remote server is available and answering pings).
By selecting different remote hosts on the network, you can determine if all
of the segments on your network are available for traffic. If multiple hosts do
not respond to a ping request, a common network device is most likely down.
Determining the faulty network device takes some detective work on your
part.
While sending a single ping packet to a remote host can determine the availability
of a network path, performing a single ping by itself is not a good indicator
of network performance. You often need to gather more information to
determine the performance of any connections between the client and the
server. A better way to determine basic network performance is to send a string
of multiple ping request packets.
Using Availability Statistics
When multiple ping packets are sent to a remote host, the ping program tracks
how many responses are received. The result is displayed as the percentage of
the packets that were not received. A network performance tool can use the
ping statistics to obtain basic information regarding the status of the network
between the two endpoints.
By default, the Unix ping program continually sends ping requests to the
designated remote host until the operator stops the operation by pressing the
Ctrl-C key combination. Alternatively, you can use the -c option in the ping command
to specify the number of ping requests to send. Each ping request
is tracked separately using the ICMP sequence field.
A sample ping session that uses multiple ping packets looks like this:
$ ping 192.168.1.100
PING 192.168.1.100 (192.168.1.100): 56 data bytes
64 bytes from 192.168.1.100: icmp_seq=0 ttl=255 time=0.712 ms
64 bytes from 192.168.1.100: icmp_seq=1 ttl=255 time=0.620 ms
64 bytes from 192.168.1.100: icmp_seq=2 ttl=255 time=0.698 ms
64 bytes from 192.168.1.100: icmp_seq=3 ttl=255 time=0.662 ms
64 bytes from 192.168.1.100: icmp_seq=4 ttl=255 time=0.649 ms
--- 192.168.1.100 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.620/0.668/0.712/0.033 ms
In this example, a response was received for all of the packets that were sent,
indicating no problems on the network. If any of the ping packets do not elicit
a response, it can be assumed that either the echo request packet did not make
it to the remote server, or the remote server's echo response packet did not
make it back to the pinging client. In either case, something on the network
caused a packet to be lost.
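A monitoring script can pull the same figures out of the summary lines. The sketch below is one possible way to do it in Python, assuming the summary format shown in the sample session; other ping implementations word these lines differently, so the regular expressions would need adjusting. The function and key names are illustrative only.

```python
import re

# Summary lines copied from the sample session above.
SAMPLE_SUMMARY = """\
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.620/0.668/0.712/0.033 ms
"""


def parse_ping_summary(text: str) -> dict:
    """Extract packet counts, loss percentage, and round-trip times (ms)."""
    counts = re.search(
        r"(\d+) packets transmitted, (\d+) (?:packets )?received, "
        r"([\d.]+)% packet loss",
        text,
    )
    if counts is None:
        raise ValueError("no ping summary line found")
    result = {
        "sent": int(counts.group(1)),
        "received": int(counts.group(2)),
        "loss_percent": float(counts.group(3)),
    }
    rtt = re.search(r"= ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms", text)
    if rtt is not None:  # the timing line is absent when every packet is lost
        names = ("min_ms", "avg_ms", "max_ms", "stddev_ms")
        result.update(zip(names, map(float, rtt.groups())))
    return result


stats = parse_ping_summary(SAMPLE_SUMMARY)
print(stats["loss_percent"], stats["avg_ms"])
```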
Once you establish that there are lost packets in the ping sequence, you must
determine what caused the packet losses. The two biggest causes of lost packets
are:
* Collisions on a network segment
* Packets dropped by a network device
Within an Ethernet segment, only one station is allowed to transmit at a
time. When more than one station attempts to transmit at the same time, a collision
occurs. Having collisions is normal for an Ethernet network, and not
something that should cause panic for the network administrator.
However, as an Ethernet segment gets overloaded, excessive collisions will
begin to take over the network. As more traffic is generated on the network,
more collisions occur. For each collision, the affected senders must retransmit
the packets that caused the collision. As more packets are retransmitted, more
network traffic is generated, and more collisions can occur. This event is called
a collision storm, and can severely affect the performance of a network segment.
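To get a feel for why added load makes matters worse, consider a deliberately simplified model, not real Ethernet CSMA/CD: time is divided into slots, and each station independently transmits in a slot with a fixed probability. A slot with exactly one sender succeeds, and a slot with two or more is a collision. The Python sketch below computes those probabilities; the station count and probabilities are arbitrary illustration values.

```python
def slot_outcome_probabilities(stations: int, p_transmit: float) -> dict:
    """Probability that a time slot is idle, successful, or a collision.

    Assumes each station independently transmits in a slot with
    probability p_transmit (a simplification, not real CSMA/CD).
    """
    idle = (1 - p_transmit) ** stations
    success = stations * p_transmit * (1 - p_transmit) ** (stations - 1)
    return {"idle": idle, "success": success, "collision": 1 - idle - success}


# Retransmissions effectively raise p_transmit for every station.
for p in (0.01, 0.05, 0.10, 0.20):
    outcome = slot_outcome_probabilities(stations=20, p_transmit=p)
    print(f"p={p:.2f} success={outcome['success']:.3f} "
          f"collision={outcome['collision']:.3f}")
```

In this model the share of successful slots peaks and then falls as the transmit probability rises, while the share of collisions keeps growing, which is the feedback loop behind a collision storm.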
Dropped packets can also result in packet losses. All network devices contain
packet buffers. As packets are received from the network, they are placed
in a packet buffer, waiting for their turn to be transmitted. This is demonstrated
in Figure 1.1.
Each port on a router or switch device contains an individual buffer area
that accepts packets destined to go out the interface. If excessive network traffic
occurs, preventing the timely emptying of the buffer, or if more packets
arrive than the port can transmit, the buffer will fill up.
If a network device's packet buffer gets filled up, it has no choice but to drop
incoming packets. This scenario happens frequently on network devices that
connect to networks running at different speeds, such as a 10/100 switch or
router. If lots of traffic arrives on a high-speed 100-Mbps connection destined for
a lower-speed 10-Mbps connection, packets will be backed up in the buffers, and
often overflow, causing dropped packets and retransmissions from the sending
devices.
To minimize this effect, most network devices are configured to allocate
ample memory space for handling packet buffers. However, it is impossible to
predict all network conditions, and dropped packets still may occur.
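The buffer behavior described above can be sketched with a small simulation. This is a toy model under stated assumptions: a single output port, a fixed number of buffer slots, and fixed packet counts arriving and leaving per time tick. The 10-to-1 ratio mirrors the fast-to-slow link mismatch in the text; the buffer size and tick count are arbitrary.

```python
def simulate_port_buffer(arrivals_per_tick: int, sends_per_tick: int,
                         buffer_slots: int, ticks: int) -> dict:
    """Count forwarded and dropped packets for one output port buffer."""
    queued = forwarded = dropped = 0
    for _ in range(ticks):
        # Arriving packets take free buffer slots; the rest are dropped.
        accepted = min(arrivals_per_tick, buffer_slots - queued)
        dropped += arrivals_per_tick - accepted
        queued += accepted
        # The outgoing link then drains what it can during this tick.
        sent = min(queued, sends_per_tick)
        queued -= sent
        forwarded += sent
    return {"forwarded": forwarded, "dropped": dropped, "queued": queued}


# 10 packets arrive per tick (fast side), 1 leaves per tick (slow side).
print(simulate_port_buffer(arrivals_per_tick=10, sends_per_tick=1,
                           buffer_slots=64, ticks=100))
```

Once the buffer fills, nearly every arriving packet is dropped for as long as the input rate stays above the output rate. A larger buffer only delays that point; it does not remove it.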
Using Large Ping Packets
Another problem with measuring availability is the size of the packets used in
the ping request. Many network devices handle packets with multiple packet
buffers, based on average packet sizes. Different buffer pools handle different-sized
packets. Too many of one particular size of packet can cause dropped
packets for that size category, while packets of other sizes are passed without
problems.
For example, switches often have three classes of packet buffers: one for
small packets, one for medium-sized packets, and one for large packets. To
accurately test these network devices, you must be able to send different-sized
packets to test the different packet buffers.
To accommodate this, most network performance tools allow you to alter
the size of the packets used in the testing. When testing networks that utilize
routers or switches, you must ensure that a wide variety of packet sizes are
used to traverse the network.
TIP There have been many instances of security problems with large ping
packets. As a result, most Unix systems only allow the root account to send
large ping packets. You should be careful when sending larger packets to
remote servers, so as to not adversely affect the remote server.
By default, the packet size used in the ping utility is 64 bytes (56 bytes of
data and the 8-byte ICMP header). You can use the -s option to change the
packet size, up to the maximum that is allowed on the network segment (1,500
bytes for Ethernet networks).
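The size arithmetic can be written out as a short Python sketch. It assumes that the 1,500-byte Ethernet limit applies to the whole IP packet and that the IPv4 header carries no options (20 bytes). Those two assumptions are not stated in the text, and the function names are illustrative.

```python
ICMP_HEADER_BYTES = 8      # stated in the text
IPV4_HEADER_BYTES = 20     # assumption: IPv4 header with no options
ETHERNET_MTU_BYTES = 1500  # maximum for Ethernet, per the text


def icmp_packet_bytes(data_bytes: int) -> int:
    """Size ping reports per reply: the data plus the ICMP header."""
    return data_bytes + ICMP_HEADER_BYTES


def max_unfragmented_data(mtu: int = ETHERNET_MTU_BYTES) -> int:
    """Largest data size that still fits in one IP packet of size mtu."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


print(icmp_packet_bytes(56))    # 64, the default
print(icmp_packet_bytes(1000))  # 1008, what a 1000-byte data size reports
print(max_unfragmented_data())  # 1472
```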
After altering the packet size of the ping packets, you can see how this
affects the ping statistics by observing the output from the ping command:
# ping -s 1000 192.168.1.100
PING 192.168.1.100 (192.168.1.100): 1000 data bytes
1008 bytes from 192.168.1.100: icmp_seq=0 ttl=127 time=2.994 ms
1008 bytes from 192.168.1.100: icmp_seq=1 ttl=127 time=2.952 ms
1008 bytes from 192.168.1.100: icmp_seq=2 ttl=127 time=2.975 ms
1008 bytes from 192.168.1.100: icmp_seq=3 ttl=127 time=2.940 ms
1008 bytes from 192.168.1.100: icmp_seq=4 ttl=127 time=3.133 ms
1008 bytes from 192.168.1.100: icmp_seq=5 ttl=127 time=2.960 ms
1008 bytes from 192.168.1.100: icmp_seq=6 ttl=127 time=2.988 ms
--- 192.168.1.100 ping statistics ---
7 packets transmitted, 7 packets received, 0% packet loss
round-trip min/avg/max/stddev = 2.940/2.992/3.133/0.060 ms
In this example, all of the large ping packets were still successful, indicating
that all of the segments between the host and the client were processing the
larger packets without any problems. If you experience packet loss with larger
packets, but not with smaller packets, this often indicates a problem with a
router or switch buffer somewhere in the network. Most router and switch
devices allow the administrator to change the packet buffer allocations to allot
more buffers for a particular packet-size range.
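Testing several packet sizes by hand gets tedious, so a small script can prepare the runs and compare the results. The Python sketch below only builds the command lines, using the -c and -s options described earlier, and picks out the sizes that showed loss. The helper names are invented for this example, and the loss figures are hypothetical placeholders, not measurements.

```python
def ping_sweep_commands(host: str, sizes: list, count: int = 5) -> list:
    """Build one ping command line (as an argument list) per data size."""
    return [["ping", "-c", str(count), "-s", str(size), host]
            for size in sizes]


def sizes_with_loss(loss_by_size: dict) -> list:
    """Return the data sizes whose runs reported any packet loss."""
    return sorted(size for size, loss in loss_by_size.items() if loss > 0)


for command in ping_sweep_commands("192.168.1.100", [56, 512, 1000, 1400]):
    print(" ".join(command))

# Hypothetical loss percentages per data size, for illustration only.
hypothetical_loss = {56: 0.0, 512: 0.0, 1000: 20.0, 1400: 40.0}
print(sizes_with_loss(hypothetical_loss))  # [1000, 1400]
```

Loss confined to the larger sizes, as in the made-up figures here, is the pattern that points toward a buffer pool for large packets.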
As seen in the ping example, while network availability is one element of network
performance, it cannot accurately reflect the overall performance of the
network. The network customers' perception of the network is not limited to
whether or not they can get to an individual server. It also includes how long
it takes to process data with the server.
To obtain a more accurate picture of the network performance, you must
observe how long it takes packets to traverse the network. The time that it
takes a packet to travel between two points on the network is called the
response time.
The response time affects how quickly network applications appear to be
working. Slow response times are often magnified by network applications
that need to send and receive lots of information across the network, or applications
that produce immediate results from a customer entry. Applications
such as TELNET, which require the customer to wait for a keystroke to be
echoed from the remote host, are extremely vulnerable to slow network
response times.
While network response time is often obvious to customers, trying to measure
the response time between two separate hosts can be a difficult thing to
do. Determining the time it takes for a packet to leave one network device and
arrive at a remote network device is not easy. There must be some mechanism
to time the leaving and arriving events, independent of the two systems on the
network.
When using network performance tools that utilize round-trip response
times, it is always wise to incorporate the remote system's CPU utilization in
the data taken, to ensure that you are comparing response times run at similar
system loads, eliminating the system-loading factor.
In large networks, there are many factors that can affect response times
between a client and a server. As the network administrator, you can control
some of these factors, but others are completely out of your control. These factors
include:
* Overloaded network segments
* Network errors
* Faulty network wiring
* Broadcast storms
* Faulty network devices
* Overloaded network hosts
Any one or combination of these factors can contribute to slow network
response time. Measuring the individual factors can be difficult, but the network
performance tools presented in this book can measure the overall effect
each factor has on network response times by sending known network traffic
samples and determining how the data traverses the network.
Determining Response Time from Ping Packets
As seen in the sample outputs for the ping program, the round-trip response
time values for each ping packet sent are shown in the ping packet statistics:
64 bytes from 192.168.1.100: icmp_seq=0 ttl=255 time=0.712 ms
The response time is shown in milliseconds. For internal LAN connections,
the response times should be well within 1 or 2 milliseconds.
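The per-packet times can also be extracted programmatically and compared against that guideline. The following Python sketch assumes reply lines in the format shown in the sample output; the 2-millisecond default threshold comes from the LAN guideline above, and the function names are illustrative.

```python
import re

# Reply lines copied from the earlier sample session.
SAMPLE_REPLIES = """\
64 bytes from 192.168.1.100: icmp_seq=0 ttl=255 time=0.712 ms
64 bytes from 192.168.1.100: icmp_seq=1 ttl=255 time=0.620 ms
64 bytes from 192.168.1.100: icmp_seq=2 ttl=255 time=0.698 ms
"""

REPLY_PATTERN = re.compile(r"icmp_seq=(\d+) .*?time=([\d.]+) ms")


def reply_times_ms(text: str) -> dict:
    """Map each ICMP sequence number to its round-trip time in ms."""
    return {int(seq): float(ms) for seq, ms in REPLY_PATTERN.findall(text)}


def slow_replies(times: dict, limit_ms: float = 2.0) -> list:
    """Sequence numbers whose response time exceeds limit_ms."""
    return [seq for seq, ms in sorted(times.items()) if ms > limit_ms]


times = reply_times_ms(SAMPLE_REPLIES)
print(times)
print(slow_replies(times))  # [] for these LAN samples
```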