Ticket #41 (closed enhancement: fixed)
Improve packet timing method
| Reported by: | aturner | Owned by: | aturner |
|---|---|---|---|
| Priority: | high | Milestone: | 3.3 |
| Component: | tcpreplay | Version: | 3.0.beta7 |
| Keywords: | | Cc: | |
| Operating System: | | Add to FAQ?: | no |
| Hardware: | All | | |
| Output of tcpreplay -V: | | | |
Description (last modified by aturner) (diff)
Create another timing method to replace gettimeofday() and nanosleep(), which aren't very granular or fast on many platforms.
So I've determined the biggest problem: 1usec resolution is not enough. Some simple math shows that @ 125,000pps, the interval is 8usec. But at 130,000pps, we're looking at 7.7usec. Integer math converts that to 7usec and we end up doing around 141,000pps.
The answer seems to be to use floating point math and then, for every 1/10 of a usec above the integer value, sleep for an extra 1usec on one out of every ten packets. Using the 130Kpps example of 7.7usec, I'd sleep for 7usec 3 times and 8usec 7 times, thus averaging 7.7usec. This seems to be pretty simple for calculating pps, but I still need to come up with an algorithm for more dynamic timing models (Mbps and multipliers). Rounding might be the easiest/best I can do.
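The averaging idea above can be sketched as follows. This is only an illustration, not tcpreplay's code: it swaps the floating point math for an integer remainder accumulator (so no rounding error builds up), and the names `pps_timer_t` and `next_sleep_usec` are made up for this example.

```c
#include <stdint.h>

/* Hypothetical state for a packets-per-second pacer (not a tcpreplay type). */
typedef struct {
    uint64_t pps;        /* target rate in packets per second; must be > 0 */
    uint64_t remainder;  /* carried-over fraction, in units of 1/pps usec */
} pps_timer_t;

/*
 * Return the whole number of microseconds to sleep before the next packet.
 * The fractional part of 1,000,000 / pps is carried in t->remainder, so the
 * returned values alternate between floor and floor + 1 and average out to
 * the exact interval.
 */
static uint64_t next_sleep_usec(pps_timer_t *t)
{
    uint64_t total = 1000000ULL + t->remainder;
    uint64_t usec = total / t->pps;

    t->remainder = total % t->pps;
    return usec;
}
```

With `pps = 130000` every call returns either 7 or 8, and 130,000 consecutive calls add up to exactly 1,000,000usec. The same carry-the-remainder approach might extend to Mbps or multiplier modes if the per-packet interval is recomputed before each call, but that is untested here.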
Change History
comment:1 Changed 3 years ago by aturner
- Priority changed from medium to high
- Description modified (diff)
comment:3 Changed 3 years ago by aturner
- Add to FAQ? unset
- Summary changed from Improve usability for performance testing to replace gettimeofday() and nanosleep() with a calculated loop
- Description modified (diff)
- Milestone changed from Future Release to 3.1
comment:4 Changed 3 years ago by aturner
- Status changed from new to assigned
A much simpler and more accurate method is to read the RDTSC (ReaD Time Stamp Counter) counter on Pentiums.
Of course, this isn't the most cross-platform solution, but PPC/UltraSparc may have similar solutions and they're less popular than x86. Anyways, this looks like the way to go for now.
comment:5 Changed 3 years ago by aturner
- Milestone changed from 3.1 to 3.2
Initial tests using RDTSC did not provide any performance benefit. Will postpone until later. May need to use a feature branch for this.
comment:10 Changed 2 years ago by aturner
According to Wikipedia, the RDTSC is inaccurate on multi-cpu/core systems and systems implementing SpeedStep. Also, the incrementing rate may not equal the clock speed on P4, XEON and Core2Duo CPUs.
Interestingly, recent Linux 2.6 kernels may use a better timer (HPET) for gettimeofday() calls if the kernel is so configured. Ideally, I should use the HPET, but it looks more OS dependent.
comment:11 Changed 2 years ago by aturner
Another option is to try to use (p)select/poll/kqueue, etc. for timing. There's a good chance that they use high resolution timers on the back end.
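As a rough sketch of the select() variant of this idea: calling select() with no file descriptors and only a timeout turns it into a sleep. The helper name `sleep_with_select` is made up for this example, and the resolution you actually get depends on the OS and kernel timer configuration.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Sleep for roughly `usec` microseconds by calling select() with empty
 * descriptor sets. Returns 0 when the timeout expired, or -1 on error
 * (for example EINTR if a signal arrived first).
 */
static int sleep_with_select(long usec)
{
    struct timeval tv;

    tv.tv_sec = usec / 1000000L;
    tv.tv_usec = usec % 1000000L;
    return select(0, NULL, NULL, NULL, &tv);
}
```

Whether this beats nanosleep() at single-digit microsecond intervals would have to be measured per platform.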
comment:12 Changed 2 years ago by aturner
interesting trick: http://c-faq.com/osdep/sd25.html
comment:13 Changed 2 years ago by anonymous
So I've determined the biggest problem: 1usec resolution is not enough. Some simple math shows that @ 125,000pps, the interval is 8usec. But at 130,000pps, we're looking at 7.7usec. Integer math converts that to 7usec and we end up doing around 141,000pps.
The answer seems to be to use floating point math and then, for every 1/10 of a usec above the integer value, sleep for an extra 1usec on one out of every ten packets. Using the 130Kpps example of 7.7usec, I'd sleep for 7usec 3 times and 8usec 7 times, thus averaging 7.7usec. This seems to be pretty simple for calculating pps, but I still need to come up with an algorithm for more dynamic timing models (Mbps and multipliers). Rounding might be the easiest/best I can do.
comment:14 Changed 2 years ago by aturner
comment:15 Changed 2 years ago by aturner
- Description modified (diff)
- Summary changed from replace gettimeofday() and nanosleep() with a calculated loop to Improve packet timing method
comment:16 Changed 2 years ago by aturner
(In [1971]) switch to timespec (nanosecond precision) for sleeping. OS X's AbsoluteTime methods are really damn accurate! refs #41
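For readers unfamiliar with timespec-based sleeping, a minimal POSIX sketch looks something like the following. It is not the code from [1971]; `ns_to_timespec` and `sleep_ns` are hypothetical helper names, and the OS X AbsoluteTime path is not shown.

```c
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Split a nanosecond count into the seconds/nanoseconds pair timespec uses. */
static struct timespec ns_to_timespec(uint64_t ns)
{
    struct timespec ts;

    ts.tv_sec = (time_t)(ns / 1000000000ULL);
    ts.tv_nsec = (long)(ns % 1000000000ULL);
    return ts;
}

/*
 * Sleep for at least `ns` nanoseconds, resuming the sleep if a signal
 * interrupts it. Returns 0 on success, -1 on any error other than EINTR.
 */
static int sleep_ns(uint64_t ns)
{
    struct timespec req = ns_to_timespec(ns);
    struct timespec rem;

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;  /* continue with the time that was left */
    }
    return 0;
}
```

Storing intervals in nanoseconds means a 7.7usec gap can be represented directly as 7700ns, although the kernel may still round the actual sleep up to its own timer granularity.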
comment:17 Changed 2 years ago by aturner
comment:18 Changed 2 years ago by aturner
comment:20 Changed 2 years ago by aturner
comment:21 Changed 2 years ago by aturner
comment:22 Changed 2 years ago by aturner
comment:23 Changed 2 years ago by aturner
comment:24 Changed 2 years ago by aturner
comment:25 Changed 2 years ago by aturner
Calculating the time passage in [1974] is broken because it doesn't take into account the amount of time it takes to send the packet, which is sufficiently non-zero to cause serious problems.
comment:26 Changed 2 years ago by aturner
comment:27 Changed 2 years ago by aturner
comment:28 Changed 2 years ago by aturner
comment:29 Changed 2 years ago by aturner
(In [2006]) merge features/performance -r1879:2005 to trunk. and cleanup. refs #41
comment:30 Changed 2 years ago by aturner
comment:31 Changed 2 years ago by aturner
comment:32 Changed 2 years ago by aturner
comment:33 Changed 2 years ago by aturner
- Status changed from assigned to closed
- Resolution set to fixed
done for now i think.
