-
Notifications
You must be signed in to change notification settings - Fork 7.7k
Description
Describe the bug
In the current Zephyr code base, there is currently inconsistent handling of the rate parameter passed into the PTP clock API in the ptp_clock_rate_adjust function. It appears this is how we ended up in the current state of affairs:
- Originally, drivers all appear to have treated this rate as a relative adjustment (i.e. compared to the last rate set), not an absolute adjustment (i.e. relative to the hardware baseline increment rate). For example, if the rate starts at 1.0, setting a rate of 0.5 followed by a rate of 2.0 would result in the rate being set to 1.0 again. This was not clear from the how the API was documented however.
- Existing PTP clock drivers in the Zephyr tree also treated this as a relative adjustment (usually requiring the driver to track the last absolute rate value and multiply it by the newly applied rate before changing the rate increment in hardware). Examples: eth_e1000, eth_stm32_hal, ptp_clock_nxp_enet.
- The gPTP implementation was using the API accordingly, i.e. the neighbor rate ratio was passed directly into ptp_clock_rate_adjust, so it was assuming the rate adjustment applied would be relative.
- In Support gPTP sync with PI control and fix i.MXRT1060 support #89234, @yangbolu1991 changed the gPTP implementation to add in a PI servo to adjust the clock rate based on the time offset value, which then resulted in this value now being expected to be an absolute adjustment. The ptp_clock_nxp_enet driver was updated accordingly, but not any of the other PTP clock drivers.
- The PTP implementation was also updated to treat this as a relative adjustment in net: ptp: support frequency adjustment with PI control algorithm #89517
The result is that any PTP clock drivers that have not been updated to treat the rate adjustment as absolute rather than relative are now broken. We ran into this with an out-of-tree driver when updating the Zephyr code base, however it appears that a number of other in-tree drivers like eth_e1000, eth_stm32_hal etc. would also be impacted. The symptom we observed from this being incorrect in our driver was that the gPTP timing control became unstable, with the rate adjustment starting to oscillate increasingly rather than converging on a stable value.
The changes @yangbolu1991 was making seem to be worthwhile in terms of increasing timing control stability, so I would suggest the following to resolve this situation and avoid further issues:
- Update all other in-tree drivers to treat PTP clock rate adjustment as relative to the hardware baseline rate, not the previously set rate
- Update PTP clock API documentation to make this expectation clear
Regression
- This is a regression.
Steps to reproduce
We reproduced this with an out-of-tree driver and I don't have access to the hardware for some of the other drivers that appear to be affected, this was determined via code inspection.
Relevant log output
Impact
Major – Severely degrades functionality; workaround is difficult or unavailable.
Environment
No response
Additional Context
No response