Ticket #41 (closed enhancement: fixed)

Opened 4 years ago

Last modified 23 months ago

Improve packet timing method

Reported by: aturner Owned by: aturner
Priority: high Milestone: 3.3
Component: tcpreplay Version: 3.0.beta7
Keywords: Cc:
Operating System: Add to FAQ?: no
Hardware: All
Output of tcpreplay -V:

Description (last modified by aturner) (diff)

Create another timing method to replace gettimeofday() and nanosleep(), which aren't very granular or fast on many platforms.

So I've determined the biggest problem: 1usec resolution is not enough. Some simple math shows that @ 125,000pps, the interval is 8usec. But at 130,000pps, we're looking at 7.7usec. Integer math converts that to 7usec and we end up doing around 141,000pps.

The answer seems to be to use floating point math: for every 1/10 of a usec above the integer value, sleep for an extra 1usec on one packet out of every ten. Using the 130Kpps example of 7.7usec, I'd sleep for 7usec 3 times and 8usec 7 times, thus averaging 7.7usec. This seems pretty simple for calculating pps, but I still need to come up with an algorithm for more dynamic timing models (Mbps and multipliers). Rounding might be the easiest/best I can do.
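A minimal sketch of the arithmetic described above, written for illustration only and not taken from tcpreplay. The names `truncated_pps`, `dither_t`, `dither_init` and `dither_next_usec` are hypothetical, and the fraction is tracked in integer tenths of a usec rather than floating point so the result is exact. An idealized 7usec interval works out to about 142,857pps, in the same range as the overshoot reported above.

```c
#include <assert.h>

/* Rate actually achieved when the per-packet interval is truncated
 * to whole microseconds by integer division. */
static unsigned truncated_pps(unsigned pps)
{
    assert(pps > 0 && pps <= 1000000u);
    unsigned interval_usec = 1000000u / pps;   /* fraction is dropped here */
    return 1000000u / interval_usec;
}

/* Dithering state: sleep the whole-usec part every time and add one
 * extra usec whenever the accumulated tenths reach a full usec. */
typedef struct {
    unsigned whole_usec;   /* integer part of the interval */
    unsigned tenths;       /* fractional part, in 1/10 usec */
    unsigned carry;        /* tenths accumulated but not yet slept */
} dither_t;

static void dither_init(dither_t *d, unsigned pps)
{
    assert(pps > 0 && pps <= 1000000u);
    /* interval in tenths of a usec, rounded to nearest */
    unsigned interval_tenths = (10000000u + pps / 2) / pps;
    d->whole_usec = interval_tenths / 10;
    d->tenths = interval_tenths % 10;
    d->carry = 0;
}

/* Returns the number of whole usec to sleep before the next packet. */
static unsigned dither_next_usec(dither_t *d)
{
    unsigned usec = d->whole_usec;
    d->carry += d->tenths;
    if (d->carry >= 10) {
        d->carry -= 10;
        usec += 1;
    }
    return usec;
}
```

For 130,000pps this yields three 7usec sleeps and seven 8usec sleeps in every ten packets, 77usec in total, which averages 7.7usec.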

Change History

Changed 3 years ago by aturner

  • priority changed from medium to high
  • description modified (diff)

Changed 3 years ago by aturner

  • description modified (diff)

Changed 3 years ago by aturner

  • add_to_faq unset
  • summary changed from Improve usability for performance testing to replace gettimeofday() and nanosleep() with a calculated loop
  • description modified (diff)
  • milestone changed from Future Release to 3.1

Changed 3 years ago by aturner

  • status changed from new to assigned

A much simpler and more accurate method is to read the RDTSC (ReaD Time Stamp Counter) on Pentiums.

Of course, this isn't the most cross-platform solution, but PPC/UltraSparc may have similar solutions and they're less popular than x86. Anyways, this looks like the way to go for now.
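For reference, a rough sketch of what reading the time stamp counter looks like in C. This is not the code committed for this ticket: `read_tsc` and `spin_for_ticks` are hypothetical names, the `__rdtsc()` intrinsic is assumed to come from the GCC/Clang `<x86intrin.h>` header, and on non-x86 builds the sketch simply degrades to a no-op. Converting ticks to wall-clock time needs the counter frequency, which is not handled here.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>            /* GCC/Clang header providing __rdtsc() */
#define HAVE_TSC 1
static inline uint64_t read_tsc(void) { return __rdtsc(); }
#else
#define HAVE_TSC 0                /* no TSC: the sketch becomes a no-op */
static inline uint64_t read_tsc(void) { return 0; }
#endif

/* Busy-wait until at least `ticks` counter ticks have elapsed. */
static void spin_for_ticks(uint64_t ticks)
{
    /* guard against accidentally requesting an effectively endless spin */
    assert(ticks < (UINT64_C(1) << 40));
    if (!HAVE_TSC)
        return;
    uint64_t start = read_tsc();
    while (read_tsc() - start < ticks) {
        /* spin */
    }
}
```

A loop like this burns a full CPU while waiting, which is the trade-off against sleeping in the kernel.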

Changed 3 years ago by aturner

  • milestone changed from 3.1 to 3.2

Initial tests using RDTSC did not provide any performance benefit. Will postpone until later. May need to use feature branch for this.

Changed 3 years ago by aturner

(In [1878]) create features area. refs #41

Changed 3 years ago by aturner

(In [1879]) create performance branch of trunk. refs #41

Changed 3 years ago by aturner

(In [1880]) commit work in progress. this doesn't seem to actually work though. :( Anyways, this just commits my work up to this point, but at some point it's going to need to be reworked. refs #41

Changed 3 years ago by aturner

  • milestone changed from 3.2 to Future Release

Changed 2 years ago by aturner

According to Wikipedia, the RDTSC is inaccurate on multi-cpu/core systems and systems implementing SpeedStep. Also, the incrementing rate may not equal the clock speed on P4, XEON and Core2Duo CPUs.

Interestingly, recent Linux 2.6 kernels may use a better timer (HPET) for gettimeofday() calls if the kernel is so configured. Ideally, I should use the HPET, but it looks more OS dependent.

Changed 2 years ago by aturner

Another option is to try to use (p)select/poll/kqueue, etc. for timing. There's a good chance that they use high resolution timers on the back end.
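A minimal sketch of the select() variant of that idea, assuming a POSIX system. The helper name `sleep_with_select` is hypothetical; the call itself is the standard trick of passing no file descriptors so that only the timeout applies. Whether the kernel backs this with a high resolution timer is platform dependent and not something this sketch can show.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Sleep for roughly `usec` microseconds by calling select() with no
 * descriptors. Returns 0 when the timeout expired, -1 on error
 * (for example EINTR if a signal arrived first). */
static int sleep_with_select(long usec)
{
    assert(usec >= 0);
    struct timeval tv;
    tv.tv_sec = usec / 1000000L;
    tv.tv_usec = usec % 1000000L;
    return select(0, NULL, NULL, NULL, &tv);
}
```

Note that struct timeval only carries microseconds, so this approach has the same 1usec resolution limit discussed in the description.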

Changed 2 years ago by aturner

Changed 2 years ago by anonymous

So I've determined the biggest problem: 1usec resolution is not enough. Some simple math shows that @ 125,000pps, the interval is 8usec. But at 130,000pps, we're looking at 7.7usec. Integer math converts that to 7usec and we end up doing around 141,000pps.

The answer seems to be to use floating point math: for every 1/10 of a usec above the integer value, sleep for an extra 1usec on one packet out of every ten. Using the 130Kpps example of 7.7usec, I'd sleep for 7usec 3 times and 8usec 7 times, thus averaging 7.7usec. This seems pretty simple for calculating pps, but I still need to come up with an algorithm for more dynamic timing models (Mbps and multipliers). Rounding might be the easiest/best I can do.

Changed 2 years ago by aturner

(In [1970]) sync up my test code so far. refs #41

Changed 2 years ago by aturner

  • description modified (diff)
  • summary changed from replace gettimeofday() and nanosleep() with a calculated loop to Improve packet timing method

Changed 2 years ago by aturner

(In [1971]) switch to timespec (nanosecond precision) for sleeping. OS X's AbsoluteTime methods are really damn accurate! refs #41

Changed 2 years ago by aturner

(In [1972]) writing to the ioport is complex enough it prolly doesn't belong full bore in the header. refs #41

Changed 2 years ago by aturner

(In [1973]) add support for rounding timespec's to usec accuracy for non-absolute timing methods refs #41
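The changeset itself is not shown in this ticket, so the following is only a guess at what rounding a timespec to usec accuracy could look like. `timespec_round_to_usec` is a hypothetical name, and round-to-nearest is an assumption; the actual code may truncate instead.

```c
#include <assert.h>
#include <time.h>

/* Round ts->tv_nsec to the nearest whole microsecond, carrying into
 * tv_sec when the rounding reaches a full second. */
static void timespec_round_to_usec(struct timespec *ts)
{
    assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);
    long rem = ts->tv_nsec % 1000L;      /* sub-microsecond remainder */
    ts->tv_nsec -= rem;
    if (rem >= 500L)
        ts->tv_nsec += 1000L;            /* round up to the next usec */
    if (ts->tv_nsec >= 1000000000L) {    /* carry into seconds */
        ts->tv_sec += 1;
        ts->tv_nsec -= 1000000000L;
    }
}
```

With this rule a 7700ns interval becomes 8000ns, while 7499ns becomes 7000ns.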

Changed 2 years ago by aturner

  • milestone changed from Future Release to 3.3

Changed 2 years ago by aturner

(In [1974]) now I calculate how much time has passed since the last packet was sent, but I still need to use the sleep-accelerator. refs #41

Changed 2 years ago by aturner

(In [1975]) cleanup refs #41

Changed 2 years ago by aturner

(In [1976]) minor cleanup. refs #41

Changed 2 years ago by aturner

(In [1977]) fix compile issues under linux refs #41

Changed 2 years ago by aturner

(In [1978]) minor tweaks refs #41

Changed 2 years ago by aturner

Calculating the time passage in [1974] is broken because it doesn't take into account the amount of time it takes to send the packet, which is sufficiently non-zero to cause serious problems.

Changed 2 years ago by aturner

(In [1979]) fix get_delta_time() to take into account packet writing time. refs #41
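The signature of get_delta_time() is not shown in this ticket, so the sketch below only illustrates the idea under assumed names (`timespec_diff_ns`, `remaining_sleep_ns`): measure elapsed time from the moment the previous send started, so that the time spent writing the packet is counted against the interval, and sleep only for what is left.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* a - b, in nanoseconds */
static int64_t timespec_diff_ns(const struct timespec *a,
                                const struct timespec *b)
{
    return (int64_t)(a->tv_sec - b->tv_sec) * 1000000000LL
         + (int64_t)(a->tv_nsec - b->tv_nsec);
}

/* How long to sleep before the next packet. `last_send_start` is taken
 * before the previous packet was written, so the write time is already
 * part of the elapsed time. Never returns a negative value. */
static int64_t remaining_sleep_ns(const struct timespec *last_send_start,
                                  const struct timespec *now,
                                  int64_t interval_ns)
{
    assert(interval_ns >= 0);
    int64_t elapsed = timespec_diff_ns(now, last_send_start);
    int64_t remaining = interval_ns - elapsed;
    return remaining > 0 ? remaining : 0;
}
```

For a 7700ns interval where sending took 3000ns, the remaining sleep is 4700ns; if sending took longer than the interval, the result is 0 and the next packet goes out immediately.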

Changed 2 years ago by aturner

(In [1980]) fix compile on non-OSX systems. refs #41

Changed 23 months ago by aturner

(In [2005]) clean up. refs #41

Changed 23 months ago by aturner

(In [2006]) merge features/performance -r1879:2005 to trunk. and cleanup. refs #41

Changed 23 months ago by aturner

(In [2008]) we don't use that anymore. refs #41

Changed 23 months ago by aturner

(In [2011]) disable the old accurate test. refs #41

Changed 23 months ago by aturner

(In [2014]) default is now gtod() refs #41

Changed 23 months ago by aturner

  • status changed from assigned to closed
  • resolution set to fixed

done for now i think.
