Opened 11 years ago

Closed 9 years ago

#41 closed enhancement (fixed)

Improve packet timing method

Reported by: aturner Owned by: aturner
Priority: high Milestone: 3.3
Component: tcpreplay Version: 3.0.beta7
Keywords: Cc:
Operating System: Add to FAQ?: no
Hardware: All
Output of tcpreplay -V:

Description (last modified by aturner)

Create another timing method to replace gettimeofday() and nanosleep(), which aren't very granular or fast on many platforms.

So I've determined the biggest problem: 1usec resolution is not enough. Some simple math shows that @ 125,000pps, the interval is 8usec. But at 130,000pps, we're looking at 7.7usec. Integer math converts that to 7usec and we end up doing around 141,000pps.

The answer seems to be to use floating point math and then, for every 1/10 of a usec above the integer value, sleep for an extra 1usec once in every ten packets. Using the 130Kpps example of 7.7usec, I'd sleep for 7usec 3 times and 8usec 7 times, thus averaging 7.7usec. This seems to be pretty simple for calculating pps, but I still need to come up with an algorithm for more dynamic timing models (Mbps and multipliers). Rounding might be the easiest/best I can do.
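One way to get the 3-times-7usec, 7-times-8usec pattern without floating point is to carry the leftover tenths of a microsecond from packet to packet. The following is only a minimal sketch under that assumption; frac_sleep_t, interval_tenths_from_pps(), next_sleep_usec() and total_sleep_usec() are hypothetical names for illustration, not tcpreplay code.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from tcpreplay): spreads a fractional
 * per-packet interval over whole-microsecond sleeps by carrying the
 * leftover tenths of a microsecond forward to the next packet.
 */
typedef struct {
    uint32_t interval_tenths; /* target interval in 1/10 usec, 77 = 7.7usec */
    uint32_t carry_tenths;    /* fraction owed from earlier packets, < 10 */
} frac_sleep_t;

/* Convert packets/sec to an interval in 1/10 usec, rounded to nearest.
 * Assumes pps > 0. */
uint32_t interval_tenths_from_pps(uint32_t pps)
{
    return (10000000u + pps / 2) / pps;
}

/* Return the whole number of microseconds to sleep before the next packet. */
uint32_t next_sleep_usec(frac_sleep_t *fs)
{
    uint32_t total = fs->interval_tenths + fs->carry_tenths;
    fs->carry_tenths = total % 10; /* keep the remainder for next time */
    return total / 10;             /* sleep only whole microseconds */
}

/* Sum the sleeps chosen for `count` consecutive packets. */
uint32_t total_sleep_usec(frac_sleep_t *fs, uint32_t count)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < count; i++)
        sum += next_sleep_usec(fs);
    return sum;
}
```

With interval_tenths set to 77 (130,000pps), ten consecutive calls to next_sleep_usec() return 7usec three times and 8usec seven times, 77usec in total. The same carry idea might extend to Mbps and multiplier modes if the interval is recomputed per packet, but that is untested here.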

Change History (33)

comment:1 Changed 11 years ago by aturner

  • Description modified (diff)
  • Priority changed from medium to high

comment:2 Changed 10 years ago by aturner

  • Description modified (diff)

comment:3 Changed 10 years ago by aturner

  • Add to FAQ? unset
  • Description modified (diff)
  • Milestone changed from Future Release to 3.1
  • Summary changed from Improve usability for performance testing to replace gettimeofday() and nanosleep() with a calculated loop

comment:4 Changed 10 years ago by aturner

  • Status changed from new to assigned

A much simpler and more accurate method is to read the time stamp counter with RDTSC (ReaD Time Stamp Counter) on Pentiums.

Of course, this isn't the most cross-platform solution, but PPC/UltraSparc may have similar solutions and they're less popular than x86. Anyways, this looks like the way to go for now.
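For reference, a rough sketch of what reading the counter can look like with GCC or Clang on x86, using the compiler's __rdtsc() intrinsic. read_tsc() and spin_for_ticks() are hypothetical names, not tcpreplay functions, and turning ticks into microseconds would still require calibrating the counter rate, which this sketch leaves out.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h> /* GCC/Clang header that provides __rdtsc() */

/* Read the CPU time stamp counter (x86 only). */
uint64_t read_tsc(void)
{
    return __rdtsc();
}

/* Busy-wait until at least `ticks` counter ticks have elapsed. */
void spin_for_ticks(uint64_t ticks)
{
    uint64_t start = read_tsc();
    while (read_tsc() - start < ticks)
        ; /* spin: no sleep, so the loop burns a full CPU core */
}
#endif
```

The busy loop trades CPU time for timing resolution, and it is only meaningful if the counter ticks at a steady rate and the process stays on one core.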

comment:5 Changed 10 years ago by aturner

  • Milestone changed from 3.1 to 3.2

Initial tests using RDTSC did not provide any performance benefit. Will postpone until later. May need to use feature branch for this.

comment:6 Changed 10 years ago by aturner

(In [1878]) create features area. refs #41

comment:7 Changed 10 years ago by aturner

(In [1879]) create performance branch of trunk. refs #41

comment:8 Changed 10 years ago by aturner

(In [1880]) commit work in progress. this doesn't seem to actually work though. :(
Anyways, this just commits my work up to this point, but at some point it's
going to need to be reworked. refs #41

comment:9 Changed 10 years ago by aturner

  • Milestone changed from 3.2 to Future Release

comment:10 Changed 10 years ago by aturner

According to Wikipedia, the RDTSC is inaccurate on multi-CPU/multi-core systems and on systems implementing SpeedStep. Also, the incrementing rate may not equal the clock speed on P4, Xeon and Core2Duo CPUs.

Interestingly, recent Linux 2.6 kernels may use a better timer, the HPET, for gettimeofday() calls if the kernel is so configured. Ideally, I should use the HPET, but it looks more OS dependent.

comment:11 Changed 10 years ago by aturner

Another option is to try to use (p)select/poll/kqueue, etc. for timing. There's a good chance that they use high resolution timers on the back end.
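As an illustration of the select() variant, the usual idiom is to pass no file descriptors and only a timeout. This is a sketch, not what tcpreplay does; select_sleep_usec() is a hypothetical name, and the actual resolution depends on the kernel's timer implementation.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Sleep for roughly `usec` microseconds by calling select() with no
 * file descriptors and only a timeout. Returns select()'s result:
 * 0 when the timeout expired, -1 on error (for example EINTR).
 */
int select_sleep_usec(long usec)
{
    struct timeval tv;
    tv.tv_sec = usec / 1000000L;
    tv.tv_usec = usec % 1000000L;
    return select(0, NULL, NULL, NULL, &tv);
}
```

Note that the timeout is a struct timeval, so this path is still limited to 1usec granularity; pselect() takes a struct timespec instead.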

comment:12 Changed 10 years ago by aturner

comment:13 Changed 9 years ago by anonymous

So I've determined the biggest problem: 1usec resolution is not enough. Some simple math shows that @ 125,000pps, the interval is 8usec. But at 130,000pps, we're looking at 7.7usec. Integer math converts that to 7usec and we end up doing around 141,000pps.

The answer seems to be to use floating point math and then, for every 1/10 of a usec above the integer value, sleep for an extra 1usec once in every ten packets. Using the 130Kpps example of 7.7usec, I'd sleep for 7usec 3 times and 8usec 7 times, thus averaging 7.7usec. This seems to be pretty simple for calculating pps, but I still need to come up with an algorithm for more dynamic timing models (Mbps and multipliers). Rounding might be the easiest/best I can do.

comment:14 Changed 9 years ago by aturner

(In [1970]) sync up my test code so far. refs #41

comment:15 Changed 9 years ago by aturner

  • Description modified (diff)
  • Summary changed from replace gettimeofday() and nanosleep() with a calculated loop to Improve packet timing method

comment:16 Changed 9 years ago by aturner

(In [1971]) switch to timespec (nanosecond precision) for sleeping. OS X's AbsoluteTime
methods are really damn accurate! refs #41

comment:17 Changed 9 years ago by aturner

(In [1972]) writing to the ioport is complex enough it prolly doesn't belong full bore in
the header. refs #41

comment:18 Changed 9 years ago by aturner

(In [1973]) add support for rounding timespec's to usec accuracy for non-absolute timing
methods refs #41
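A sketch of what rounding a timespec to usec accuracy could look like. Whether [1973] rounds to nearest or truncates is not stated here, so round-to-nearest is an assumption, and timespec_round_usec() is a hypothetical name.

```c
#include <time.h>

/*
 * Round a timespec to the nearest whole microsecond, in place.
 * Assumes a normalized, non-negative value (0 <= tv_nsec < 1e9).
 */
void timespec_round_usec(struct timespec *ts)
{
    long usec = (ts->tv_nsec + 500) / 1000; /* round half up to usec */
    ts->tv_nsec = usec * 1000;
    if (ts->tv_nsec >= 1000000000L) {       /* rounding carried into seconds */
        ts->tv_sec += 1;
        ts->tv_nsec -= 1000000000L;
    }
}
```

For example, 7692 nsec becomes 8000 nsec, and 1 sec plus 999999600 nsec becomes exactly 2 sec.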

comment:19 Changed 9 years ago by aturner

  • Milestone changed from Future Release to 3.3

comment:20 Changed 9 years ago by aturner

(In [1974]) now I calculate how much time has passed since the last packet was sent, but I
still need to use the sleep-accelerator. refs #41

comment:21 Changed 9 years ago by aturner

(In [1975]) cleanup refs #41

comment:22 Changed 9 years ago by aturner

(In [1976]) minor cleanup. refs #41

comment:23 Changed 9 years ago by aturner

(In [1977]) fix compile issues under linux refs #41

comment:24 Changed 9 years ago by aturner

(In [1978]) minor tweaks refs #41

comment:25 Changed 9 years ago by aturner

Calculating the time passage in [1974] is broken because it doesn't take into account the amount of time it takes to send the packet, which is sufficiently non-zero to cause serious problems.

comment:26 Changed 9 years ago by aturner

(In [1979]) fix get_delta_time() to take into account packet writing time. refs #41
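The idea of accounting for packet writing time can be sketched as follows: take the timestamp before the write, so the elapsed time measured afterwards already includes the cost of sending, and sleep only for what remains of the interval. The signature of get_delta_time() is not shown in this ticket, so timespec_diff_nsec() and remaining_sleep_nsec() below are hypothetical stand-ins.

```c
#include <stdint.h>
#include <time.h>

/* Nanoseconds elapsed from `start` to `end` (assumes end >= start). */
int64_t timespec_diff_nsec(const struct timespec *start,
                           const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (int64_t)(end->tv_nsec - start->tv_nsec);
}

/*
 * How long to sleep before the next packet. `last_send` is sampled just
 * before the previous packet was written, so the elapsed time already
 * includes the cost of the write itself. Returns 0 when the interval has
 * already been used up.
 */
int64_t remaining_sleep_nsec(const struct timespec *last_send,
                             const struct timespec *now,
                             int64_t interval_nsec)
{
    int64_t elapsed = timespec_diff_nsec(last_send, now);
    return elapsed >= interval_nsec ? 0 : interval_nsec - elapsed;
}
```

With a 7700 nsec interval and 3000 nsec already spent writing the packet, the remaining sleep is 4700 nsec; if the write took longer than the interval, no sleep is needed at all.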

comment:27 Changed 9 years ago by aturner

(In [1980]) fix compile on non-OSX systems. refs #41

comment:28 Changed 9 years ago by aturner

(In [2005]) clean up. refs #41

comment:29 Changed 9 years ago by aturner

(In [2006]) merge features/performance -r1879:2005 to trunk. and cleanup. refs #41

comment:30 Changed 9 years ago by aturner

(In [2008]) we don't use that anymore. refs #41

comment:31 Changed 9 years ago by aturner

(In [2011]) disable the old accrate test. refs #41

comment:32 Changed 9 years ago by aturner

(In [2014]) default is now gtod() refs #41

comment:33 Changed 9 years ago by aturner

  • Resolution set to fixed
  • Status changed from assigned to closed

Done for now, I think.
