-
Notifications
You must be signed in to change notification settings - Fork 8.2k
[RFC] Introduce Clock Management Subsystem (Clock Driver Based) #72102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[RFC] Introduce Clock Management Subsystem (Clock Driver Based) #72102
Conversation
b4d7f94 to
8cc246a
Compare
Essentially to improve readability, and ideally also to add validation. The big difference between what we have now and what this framework proposes is that a clock node defined for the clock control framework can utilize the
I've looked at them in the context of your MUX RFC, yeah. My understanding is that the macros there are used to encode register writes and the values within one phandle, like so: Were you envisioning a similar usage here, or something different? Clock drivers could definitely define their configuration as a series of register writes, but I'm not sure how that solves the readability/validation problem we have here, unless you were thinking of using those macros differently?
I think the |
FWIW, you don't even need to over-declare clock cells, which would make the bindings ugly: you can just
FTR this suggestion was more of a "this is possible" idea than something serious; as you point out it is definitely a bad idea as labels must be unique in global namespace. (also, I haven't verified whether To summarize the main points of a discussion with @danieldegrasse, the main issues with the
Using properties - as implemented by myself - helps mitigate these issues but also brings some of its own:
We did brainstorm a few ideas to tackle these shortcomings, but nothing has been implemented so far - I plan to revisit this topic when time allows. (which is another reason why going for a Note by the way that the properties-based system would only work for boot-time initialization. Anything dynamic will have to be done through |
|
A few points after reading the documentation:
Overall, the documentation seems fair: there are a few typos here and there, and some things could be reworked a bit, but it gives a reasonable overview. I added some feedback on #94679 to avoid cluttering this PR's discussion feed too much. Somewhat unrelated but another thing that came to my mind is a fairly common situation on STM32: some IPs have two clock inputs ( [1] Not physically but the same bit in RCC controls both gates - see below example where |
No, the PHA macros I add are to put any arbitrary data in phandle specifiers. On that PR, there is 3 drivers being added, and only one of them is "encoding register writes" in the specifier. One of the others is just an index into a list on a different node, and the last is corresponding to a number representing what should the gpio select lines of an external circuit should be. So, I don't know what data would be useful to put in the specifier here, but it doesn't strictly have to be register addresses and values, although that is an option. |
In my opinion , this is a non-issue. We already had plenty of phandle arrays in tree in zephyr for all sorts of things, and not to mention every other project that uses DT such as linux or uboot, and nobody ever complained about this.
In my opinion, the explosion of so many properties and clock bindings in the way this PR is right now, is more of a confusing thing, it's so much chaos to sort through from my perspective. I would like to see some more monolithic clock controller devices and less bindings and properties.
This seems like an implementation problem, not a design problem with using phandle specifiers. Can you give an example of what you mean.
Eventually, I imagine this will turn into a collab branch, but even a collab branch will be sticky because major changes will require reworking every other SOC vendor SOC. So until we have the clear idea that everybody mostly agrees on of what we want to do, since this is such a herculean task, then we shouldn't make any shared branch at all IMO. Now also I am not saying that it's up to Daniel to be reworking this PR until infinity during this whole thing - I would like to see some more people making their own branches and showing PoCs of their ideas, this is something I was going to do if I find time but, yes, this is a herculean task so that is asking a lot. In fact, I haven't had appropriate time to even review this PR.
Exactly another point of using phandle specifiers, you can use |
|
I also think this RFC might be trying to do too much right now. I don't think we should be trying to negotiate the configuration of PLLs and oscillators between different consumers at runtime right now, it's just adding to the complexity a lot. Reconfiguring them based on power states seems like a reasonable goal right now, and is the main reason I see anybody even wanting to reconfigure them. And even tying them into the Zephyr PM domains and runtime device PM to place constraints on the clocks being disabled using pm_device_put/get instead of going through such a complicated callback process, seems like a better idea. Maybe instead of getting lost in implementation discussions, we need to back up here and ask, what even are the specific requirements and use cases of a "clock manager" that we are trying to meet, that we can't do today. Because right now I am confused about that, this discussion has been all over the place with people saying a load of different stuff for over a year now, it doesn't seem focused on specific goals to me. |
It is- but what would adding a function like this accomplish? In my vision, a user would consume the API like this: I'm unclear what status the user would need besides:
I guess we need to figure out if this is a common use case. My expectation was that clocks would usually be turned on at init, and disabled in power management handlers
I'll be honest- I don't think this is something we should support. If a user really wants to optimize their build like this, they can overwrite the dt property of a multiplexer to remove parent clock nodes. I don't think we should allow clock drivers themselves to optimize parents away.
Appreciate the feedback- will work on making updates in response when I have some BW. |
The actual main problem description in this PR is actually solved by just refactoring the clock control system we already have to just use macros in the similar way to how I did in the mux RFC, which makes it totally generic to consumers. And these macros are already merged in tree available to use in a similar fashion. So, we don't need a whole new clock framework just to solve this particular problem.
Same problem as above actually, it's just naming the particular function which requires the opaque information.
What makes them unrealistic with
Ditto again from above, solved by already existing DT macros and already possible consumer paradigms. |
decsny
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not clear what we are even trying to accomplish with a clock manager framework, the problem description in the original post is mostly already achievable by just refactoring the ways consumers retrieve the opaque DT data to use the existing clock control API.
I would like to see a clear list of requirements and use cases that a clock manager framework is actually supposed to be meeting and solving that we cannot do today. I am sure there are some, but we need to be focused on the problems, not lost in our keyboards continually redefining an implementation for something that we are not even clear what is the goal of.
| (DT_NODE_HAS_COMPAT_STATUS(DT_NODELABEL(flexcomm4), nxp_lpc_spi, okay) && \ | ||
| CONFIG_SPI_MCUX_FLEXCOMM) || \ | ||
| (DT_NODE_HAS_COMPAT_STATUS(DT_NODELABEL(flexcomm4), nxp_lpc_usart, okay) && \ | ||
| (CONFIG_UART_MCUX_FLEXCOMM && !defined(CONFIG_CLOCK_MANAGEMENT))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you are keeping the old default clocking code for backwards compatibility if someone with this board doesnt want to use clock management?
It seems like instead of appending && !defined(CONFIG_CLOCK_MANAGEMENT) on every ifdef , you can just wrap the all if it in one #if defined(CONFIG_CLOCK_MANAGEMENT)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, can update this next time I push. Did this for backwards compatibility and to be sure the clock framework was actually configuring the UART clock.
| clock-state-0: | ||
| type: phandles | ||
| description: | | ||
| Clock configuration/s for the first state. Content should be a series of | ||
| references to the clock nodes declared as children of the clock | ||
| controller. The specifier cells to these clock nodes are specific to the | ||
| implementation of the system's clock controller | ||
| clock-state-1: | ||
| type: phandles | ||
| description: | | ||
| Clock configuration/s for the second state. See clock-state-0. | ||
| clock-state-2: | ||
| type: phandles | ||
| description: | | ||
| Clock configuration/s for the third state. See clock-state-0. | ||
| clock-state-3: | ||
| type: phandles | ||
| description: | | ||
| Clock configuration/s for the fourth state. See clock-state-0. | ||
| clock-state-4: | ||
| type: phandles | ||
| description: | | ||
| Clock configuration/s for the fifth state. See clock-state-0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is it like this? just put an arbitrary number of phandles on one clock-states property?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did this to match the way that pinctrl is defined. Also, we have the ability to have multiple output clocks per node. So something like the following is possible:
emul_dev1: emul-dev1 {
compatible = "vnd,emul-clock-consumer";
clock-outputs = <&dev1_out_slow &dev1_out_fast>;
clock-output-names = "slow", "fast";
clock-state-0 = <&dev1_10mhz &dev1_100mhz>;
clock-state-1 = <&dev1_3mhz &dev1_25mhz>;
clock-state-names = "default", "sleep";
};
dev1_10mhz corresponds to the default state for clock output dev1_out_slow, and dev1_100mhz corresponds to the default state for clock output dev1_out_fast
I'll be honest, that's the core goal of this framework- I want a clock consumer to be able to request a new clock setting, and have other consumers get notified about that change (and optionally reject it) The other approach (as you mentioned) is monolithic clock power states. I a while back in #70467, using the nomenclature of "setpoints". There was some good discussion there. In essence the use case that drove the move to this framework was something like what is described here: #70467 Essentially:
There isn't really a good way to do this with monolithic power states- you are going to end up with a huge matrix of static clock tree definitions for every dependency like this in a system. |
You're right- the initial post doesn't really describe the problem well. The initial post was also written 1.5 years ago, and last edited 9 months back. The framework has undergone at least 2 major reworks since then in response to feedback. I'll attempt to describe the goals of the framework as it exists now below. If you'd like me to edit the description of the PR itself to integrate this, I can do so. IntroductionThis PR proposes a clock management subsystem. The core goal of the subsystem is to support runtime reconfiguration of the clock tree, by handling dependencies between clock producers and notifying clock consumers as appropriate. This subsystem continues to support static configuration like what is possible using the clock control framework today, but encapsulates SOC clock trees into a series of clock producer drivers, which are opaque to clock consumers and managed by the subsystem itself. Problem descriptionCurrently, we support some level of runtime clock configuration in Zephyr- however this configuration is accomplished by two functions: Beyond this, there is no generic framework for selecting the best clock given a requested rate. This is a relatively generic problem, with an implementation like the following:
If SOC clock drivers want to implement this functionality, they will need to do it for their specific clock tree. Moreover, the fact that clock APIs don't include any sort of "query rate" API means that clock drivers are inherently going to be monolithic, limiting code reuse and precluding generic clock drivers entirely. To be specific, a request that warrants this type of runtime resolution might be something like the following:
In the current clock control driver design, code that should otherwise be generic would need to be placed in the vendor's driver to perform this kind of resolution. With the clock management framework, the resolution might look like this:
Proposed changeThe core of this subsystem is splitting clock producers into "clock drivers". Each driver implements a clock API described here: https://builds.zephyrproject.io/zephyr/pr/94679/docs/hardware/clock_management/index.html#implementation-guidelines. Drivers can be configured using vendor-specific data encoded into devicetree in "clock states", or can be configured by a runtime request. The clock management subsystem handles generic tasks like selecting the best clock producer for a given request, and validating a clock rate with consumers. The subsystem is designed with 3 levels of configuration:
|
I'll use H7 as example again which has three PLLs:
The DTS would be something like: And the initialization sequence: If The advantage of a properties-based approach is that we (as in, whoever implements the SoC side) could enforce a specific initialization order, ensuring that such issues cannot occur.
Besides what I described above, the big reason for a properties-based proposal was also that a single boot-time configuration is what (almost?) everyone uses today in Zephyr - unless I missed it and there's already a platform with dynamic management?
The use case I keep thinking of is the STM32 TRNG driver where we query the clock state to know if driver is being called in re-entrant fashion from ISR (well, really, that's not how it's done today but was how I planned to do it). I guess a "refcount" would be an acceptable alternative though. I must admit I didn't think about using
For context, again this came mostly from the STM32 TRNG driver which automatically starts/stops itself depending on the state of its internal entropy pools. Maybe this implementation is not correct w.r.t. Zephyr's PM framework.
Even ignoring the whole By having the clock tree in DTS instead, you get The downside is that this does increase FLASH footprint: some figures have been posted several times in this thread, the ballpark is around 300~400B on a "simple" product like STM32C0. A change in scope and paradigm can definitely reduce that (e.g., "each DT node must describe operation performed on input clock" ==> solve entire clock tree at build time ==> leave only leaf nodes (clock gates) as instances in the final image + write-what-where sequence executed during initialization); this brings back your legitimate question about "what are we trying to achieve" which may need to be discussed more formally again. |
| * @param ev_type Type of clock notification event | ||
| * @return 0 if notification chain succeeded, or error if not | ||
| */ | ||
| static int clock_notify_children(const struct clk *clk_hw, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no offense, but in my opinion this function is written terribly, there is WAY too much block nesting and cognitive complexity, it needs to be refactored. I suggest looking at the SonarQubeCloud suggestions as it has found 41 issues on this PR which is the most I've ever seen (probably because it is a huge PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I can refactor this into sub functions for each clock type. I'll take a look through the sonarcube- some of those suggestions are very good (I'll admit I initially discounted it when I saw that because it was complaining about recursion, which this PR needs to function)
If the original post is way out of date, then yes, I think you should update it.
I am still confused what is meant by "managed by the subsystem itself", maybe I am just missing something, but I see no subsystem being added on this PR, only a driver class.
I don't agree still with this problem description, this is just an implementation problem, see again what I said above about we can use the now-existing macros to form opaque arrays of arbitrary data from DT. So IMO the existing clock control API does not "fundamentally" have a design which couples consumers to specific drivers. If I am not being clear about this, maybe I can try to make a PR to add some helper macros and show some kind of example (although I admit, refactoring some of these platform to use the clock control API properly, is a big task of its own).
Instead of focusing on the implementation in the problem description, I think it should be describing the requirements of rate requesting. I am not interested in the implementation details when I am reviewing this architecture proposal, I want to know, what exact problem(s) are we trying to solve and what are the characteristics of any solution that would address them.
I don't know what this is supposed to mean, honestly. What does having a query rate API have to do with precluding generic clock drivers and code reuse? What even is a "generic clock driver"? You need to define these terms you're using.
I will say, this sounds very similar to power domains structure that PM subsystem has, and I suggest to take some inspiration from how that is designed, because IMO it has (or is in the process of) being designed very coherently. It is essentially another example of a "constraint based dependency tree manager". @bjarki-andreasen maybe can comment about this.
I am just wondering, have you considered putting clock object or callback in the device model, similar to PM?
For the "basic configuration" I wonder why we can't just use the existing clock control drivers and API, if it's supposed to be the same.
It seems like this can be done by extending the device model with a clock callback and extending the clock control API by having it be allowed to perform this behavior when clock_control_configure is called. And I understand you have a notion of "clock devices" which are miniature polymorphic objects to optimize instead of using full "devices" for every clock. But my comment again has been and still is, that we should not be having a "driver" for every single clock source, mux, div, whatever, in the SOC. The driver should be for a reusable and cohesive IP module from a hardware vendor, not trying to slice it up into all these mini drivers, which, if you're worried about ROM usage, IMO is probably going to be worse. The current clock control drivers are mostly written badly and use #ifdef to control rom usage, but they can be written much more simply if they just made use of DT phandles properly and if consumers did not intentionally couple themself to a driver, when it's not even necessary. I think the issue you are trying to solve is that a lot of current clock control driver, cast an integer into a void *, which is just wrong and not allowing them to actually use the existing flexibility of clock control API at all. So why don't we try to first use the clock control API as it was intended and see what extensions are needed instead of reinventing everything from scratch?
Yeah, this , is more complicated, but again my question still stands from before, which is what are the actual use cases for all of this that we are trying to do. You gave a lot of abstract examples in the form of talking about a theoretical X, Y, A, B, C, okay, but I am not interested in letters and academic derivations, what I want us to be focused on here is what are the actual technical use cases we are trying to accomplish with a clock control or manager framework. WE need to stay focused on these actual goals when we are considering doing this major architectural change, it should not be theoretical and academic in motivation. |
This is not a problem with the concept of using phandles instead of nodes-based hierarchies and properties. This is a problem, again, which I have said, of having WAY too many "clock drivers" because of the way this RFC is designed right now. A clock consumer referencing it's clocks with a
I don't understand, are you trying to say the phandles specifier form would not allow for static configuration? Because that's how we actually are doing static configuration right now for NXP. So I don't get your point here.
You can add reusable code to be shared between drivers without creating a super complex architecture around that. What you just showed, is not an improvement, it is just more confusion. Because these are actually all now their own components and there is so many side effects and cross-component communications going on, it is getting out of hand in my opinion.
Yes, this is exactly one or two of the points I was making, basically. You have to also realize, back to the phandles, that phandles form a tree too. The "child/parent node" DTS structure is specifically meant for describing buses, although it is also commonly used for describing subcomponents of devices, which is fine. But what we are talking about here is describing a tree of inter-device dependencies, which, is not supposed to be solved with child/parent DTS nodes, it's supposed to be solved by phandles. The DT spec clearly talks about use cases of phandles this way, such as the "interrupt tree" which is described through "interrupts" properties. And the way clocks are done in every other project that uses DT is through a "clocks tree" which is described through "clocks" properties. So I think we should be consistent with how DT is used across all projects, and not be doing things differently in a weird unintended way for Zephyr which IMO is not even justifiable. From what I have heard, the arguments for doing this tortured DT scheme do not make any actual sense, and are mostly pointing at design failures of the actual subsystem in code, not DT. The reason you are running into this problem is because you are tightly coupling DT structurally to subsystem implementation, this is wrong, and this is why we always have the mantra of "don't put software configuration in DT". In this case, sure you're trying to describe hardware, but you're trying to do it in a way that is tightly coupled to your "subsystem" implementation (note: you are not even coupling it to the subsystem design, but actually to the implementation, this is horribly terrible). Each DT node should describe a modular IP, and have a driver that can program to that hardware interface. If there is any dependencies between them they should be expressed by phandles. Having nodes for every single mux, div, gate, and bit within an IP, is completely not the point of what we should be doing in DT, it's totally out of line in my opinion. |
|
Just to show, I'm not just being hand wavy about what I'm talking about with how we can already solve the opaqueness/coupling problem by just adding some useful macrobatics, look at this WIP PR: #96315 All I did with it so far is hello world on frdm_rw612, it's totally incomplete, but I am probably going to push this PR to completion. And it is not going to solve most of the problems we are talking about here but my point is we do not need a gigantic architectural rework just to solve the issue of consistently stupid implementation problems in existing clock control drivers, which is what was CLEARLY the original motivation of this PR (besides the original post focusing on that so much, I KNOW that was @danieldegrasse original motivation because when he was at NXP that is the main problem we were having that motivated him to even be looking at it in the first place). As far as the new motiviations that have cropped up and expanded the scope of this architecturally hyperinflated discussions, I want to know what are the actual use case that is motivating them, because all I have seen in the last year is a bunch of silicon vendors pontificating academically about all the things they could theoretically do with their clock trees and musing over implementation after implementation and architecture after architecture, with no clear focused guiding requirements, IMO that is not how we should be approaching this AT ALL. And why am I not hearing any alternative proposal that actually considered, what if we just try incrementally extending appropriately and actually implementing correctly the existing clock control API? I mean this is just looking to me like a classic example of software developer natural tendency (not really the best link I could find on the phenomenon but I think anybody that has been in the industry for any amount of time knows what I'm talking about:) to want to "rewrite" an entirely new "glorious" thing from scratch instead of fix and improve the existing stuff. |
Maybe it is time to restructure the code then. In the initial implementation of this PR, most of the clock management logic resided in drivers, and a few common helpers were defined in Would it make sense to instead split the code in
We can define some form of generic macro that creates a data type to pass to these functions which is more generic, but the APIs themselves place no restriction on implementers- a vendor can choose to use these generic macros, or they can bypass them entirely. The key point is that the consumer facing API takes opaque arguments, which by definition makes it possible to leak implementation details between the clock producer and consumer.
Perhaps this explanation dives too deep into implementation, sure. I think the core problem is runtime resolution, which I described in the comment above (see the section about consumer X and Y)
I don't agree with the argument here. The clock tree of an SOC does not constitute a single "reusable or cohesive IP module". It is a set of reusable IP blocks. Let's take the clock tree of the LPC55S69 versus the iMXRT595 as an example:
If we want to maximize code reuse, we need to split drivers into components:
We then can reuse the divider and multiplexer drivers across the clock tree for each SOC.
I don't have a concrete example for you unfortunately, this was a usecase that came from @hubertmis. I would love if @bjarki-andreasen or @hubertmis could chime in with a specific case here. NXP parts don't have a usecase like this I am aware of. Here's a marginally more concrete example though:
I want to be clear here. This is the type of usecase the framework is designed to solve in its current form. Static boot time configuration is a problem we can solve a myriad of other (simpler) ways. The only goals this framework has for static configuration are:
I appreciate you doing this- it is good to try to move things forward in incremental steps, and it is absolutely possible to solve NXP's static clock configuration issue without the clock management PR. It would be great to see that move forwards. As I hope is clear, this PR goes well beyond just solving NXP's static clock configuration problem, which was indeed the initial reason I started work on this- at this point though the proposal is for a subsystem, and a subsystem needs to work for more usecases than NXP's alone.
I'm unclear what you mean here. #70467 was an alternate proposal. It was considered. If you'd like to make another alternate proposal, please do- there is no need for this PR to be the only proposed solution. This PR has also gone through several reworks- some large enough I would argue they constitute a new proposed approach to the problem. Based on your comments I'm concerned you are focused on this PR as just a static clock framework. It isn't that- if it was, this would be vastly overcomplicated. I've implemented the runtime features that exist in response to feedback from @bjarki-andreasen and @hubertmis. If either of you could add a concrete example to help highlight why we need runtime rate resolution features, that might be helpful |
|
I agree that current separation of individual "mux" and "div" are way too complicated. MPUs can have 100+ clock cells, resulting in giant clock trees in dts.
Usually the code reuse is limited within vendor's scope, would it be possible to design an higher level abstraction in the subsystem's view, and allow for vendor specific implementation for clock cell components? I think we are not expecting clock config being portable across vendors anyway. In my concern, the |
FTR, there is probably a bit of confusion here: the zephyr/boards/nxp/lpcxpresso55s69/lpcxpresso55s69_lpc55s69_cpu0.dts Lines 131 to 143 in 7764d24
Each device should indeed only refer to its parents, I was not suggesting otherwise.
No, I was merely trying to say that static initialization is presumably what most people use, hence why we should try to make it as easy to use as possible (e.g., properties-based).
That's kinda what we have with
Which side effects and cross-component communications? Drivers don't talk to each other, the subsystem manages that now.
That was one of the solutions I was thinking about to solve the initialization order issue, actually - having properties somewhere to describe that some nodes must be init'ed in a specific order relative to each other. Though, for this very instance, since the PLL configuration is (mostly) locked after being enabled, the entire configuration could be done by the PVCO node (the outputs would still need separate nodes). This doesn't apply to all series; some of them allow specific outputs to be disabled on-the-fly. |
|
To summarize the requirements from Nordic:
For us, just defining every clock as a device, and having every configuration it can be in as child nodes we can select, we have all we need. On our hardware, the "modes" do not affect frequency, only accuracy, so we just need to be able to specify an output frequency from drivers, the application can then determine, at runtime, which power mode to use. |
Sure- a vendor is already able to implement clocks within this framework at less granularity than I described. It is possible to simply implement an entire clock tree as a series of "root clocks" if desired. The downside to using this approach is that the vendor will be responsible for handling negotiations between clocks, since the framework won't know how the tree is structured. One thing I'll note- the API as proposed here is now very close to the Linux common clock framework, which is a proven working framework on SOCs much more complex than Zephyr supports. The main difference is that we describe our clock tree in DTS, while Linux describes the tree in C. I chose this approach because we need the ability to reference individual nodes in DTS to configure them, which isn't really possible if we only have the clock tree defined in C. Essentially our runtime configuration API (set_rate and similar) is the same as the Linux CCF, and the only variation is that we support this |
Yeah probably I would prefer to see it this way, and would lead the development (both on this PR and in the future) also down a path which I think is more sensible, which is to have a more central clock managing logic.
I don't agree with this really that coupling is required with void *. My experience with In fact, I guess that is actually what I am trying to do on the PR I linked before, I am proposing a unification of the clock control API so that all client and implementation codes for clock control know the same format - the arbitrary DT spec format, which is the most general way to pass around the data we are actually trying to pass around - the DT specifier cells. So maybe you are right that void * was not an ideal choice here but I think the existing API can be made generic and unified with some trivial (but monotonous) refactoring. Actually it's a good use case for AI.
Again I don't think opaque argument leaks implementation details at all actually, that's the meaning of opaque. What we are seeing actually is that this current void * usage is NOT being used as opaque. An alternative to the PR I posted above is actually doing something similar to pinctrl_soc.h where, instead of a unified data scheme like the clock_control_dt_spec, it could be that each SOC implement it's own clock getter and definer and type, just like with pinctrl.
I am still looking to focus this RFC on use cases and not theoretical situations.
First of all, at no point did I say that the "entire clock tree of an SOC is a single reusable IP module". I said, that any IP which are reuseable (even if they are just different configurations of the same IP) should be getting a driver and compatible. Which I would expect that there would be multiple clock drivers per SOC.
Because they are both syscon "clkctl"/CCM (clock control module) IP. So what I am saying is there should be one driver for the "clkctl", not a driver for a mux and a driver for a div. I know the LPC doesn't call it that, and groups it in the same chapter and same memory map, but I can tell you a secret which is not surprising which is if I look at the actual chip data, I can clearly see there is a "CCM" IP and a "PLL" IP which are separate. So what I am saying is that we should not need multiple drivers for one IP, such as the CCM. But yes, the PLL would be a different driver than the CCM.
That's fine again, the platform differences are as always going to be described in DT. We are not debating if the different platforms are different, but how to describe them in DT.
I think driver for the CCM, driver for the LPC PLL, and driver for the iMX PLL.
What I am trying to see is if we can instead of replacing (even with feature parity) the existing clock control, is if we can improve the clock control and build on top of it instead. I am not saying, to throw out everything in this PR, but I would like to see if it can be adjusted to be simpler, and not requiring a switch and deprecation.
cc @EmilioCBen (he was wanting to know if you will block my PR)
I am completely aware you are going beyond the original motivation, and I am not saying not to try to do that, my points I am making here are basically these things:
You are right, which is what I said above, I would like to see some alternative proposals from more people and I considered doing something myself. However, the reason I said what you quoted here is because if you are saying we need to deprecate the old APIs instead of improving/extending them, I feel like you need a justification for saying why we can't be considering doing that. And the old one you linked also was trying to replace it.
No, I'm not focusing it on that, in fact, I want to solve the static clocking thing separately in the meantime, and what I am actually saying here is that I want this PR to be more focused on the dynamic clocking, with intentionality around specific use cases. And it would be nice if it can build on top of some intermediate work that solves the static problem, but this is where I AM being handwavey, because I do not know how that would look. |
One thing I will point out is that there is already a set_rate function in the current clock control API. But it suffers from the same problem as the configure function. Yet I think it can also be solved in similar ways. How I would like to see this PR adding value is by creating a subsystem layer that lets the clock control drivers call on a clock manager subsystem to ensure the clock values they get requested is OK, then they are responsible for actually changing the hardware state. |
I think the problem here is really this granularity: the framework explicitly defines standard clock device, mux clock device and root clock device. Many SoCs are designed to have 1 gate, 1 mux and 1 divider per IP to allow maximum flexibility. This framework enforces 3 separate DT nodes instead of 1. I'd like to see composed clock devices for simpler DT hierarchy, less layers of callbacks as well as code size reduction (possibly). Some other comments: For external clocks like Also added comments for |
| struct clock_management_event { | ||
| /** Type of event */ | ||
| enum clock_management_event_type type; | ||
| /** Old clock rate */ | ||
| clock_freq_t old_rate; | ||
| /** New clock rate */ | ||
| clock_freq_t new_rate; | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some clock drivers requires RPC to another "management" core in order to get their current config (e.g. mux option or divider), which can be slow compared to direct register access. During the 1st call of CLOCK_MANAGEMENT_QUERY_RATE_CHANGE event, the underlying driver has already gathered some info. Would it be possible to add some kind of hint that subsequent CLOCK_MANAGEMENT_PRE_RATE_CHANGE and CLOCK_MANAGEMENT_POST_RATE_CHANGE event for the same clock device comes from the same request, and the info gathered from first call could be reused?
I'm not super familiar with NXP HW but I assume this and somehow the (Such an hybrid approach of one-driver-manages-many-devices could be interesting, I'm just having trouble seeing how it could be done with today's tools)
There is still room for improvement within the Clock Control implementation, but some things will require more than little API-compatible edits. For example, the In the end, By the way, with regard to footprint, I would like to point out that - at least for STM32 - the results are somewhat biased; we are "cheating" by effectively having two different formats for the clock cells, depending on whether they will be provided to [1] The usual
Bring back to the same point: the separation will be done, the question is whether it should be in DTS or in C code (Linux does it in C code - example). It is true that DTS has a downside of being quite verbose, but @danieldegrasse demonstrated that it could be generated automatically via scripting (though one may argue you could generate C code this way too).
+
I do have a similar mindset, i.e. focus should be placed on having a solid framework for static configuration but also designed such that one may easily "plug in" a management framework on top if desired; whether this framework should be implemented completely, partially or not at all by Zephyr being another topic; the "plug in" mechanism must however be thought about carefully. |
| # Copyright 2024 NXP | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| compatible: "clock-state" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs to be prefixed with zephyr, since it is not describing hardware at all but only is a mechanism for doing programming in DT.
| may also be defined as children of these nodes, and can be applied directly | ||
| by consumers. | ||
| compatible: "clock-output" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this similarly should probably be prefixed with zephyr, , even though it is technically descriptive of hardware, it's only needed to host the clock-state nodes, and therefore is unique to zephyr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While we're at, maybe also worth renaming as the output usage is counter intuitive: We're talking about the entry clock of the node defining this property.


Introduction
This PR proposes a clock management subsystem. It is opened as an alternative to #70467. The eventual goal would be to replace the existing clock control drivers with implementations using the clock management subsystem. This subsystem defines clock control hardware within the devicetree, and abstracts configuration of clocks to "clock states", which reference the clock control devicetree nodes in order to configure the clock tree.
Problem description
The core goal of this change is to provide a more device-agnostic way to manage clocks. Although the clock control subsystem does define clocks as an opaque type, drivers themselves still often need to be aware of the underlying type behind this opaque definition, and there is no standard for how many cells will be present on a given clock controller, so implementation details of the clock driver are prone to leaking into drivers. This presents a problem for vendors that reuse IP blocks across SOC lines with different clock controller drivers.
Beyond this, the clock control subsystem doesn't provide a simple way to configure clocks.
clock_control_configureandclock_control_set_rateare both available, butclock_control_configureis ripe for leaking implementation details to the driver, andclock_control_set_ratewill likely require runtime calculations to achieve requested clock rates that aren't realistic for small embedded devices (or leak implementation details, ifclock_control_subsys_rate_tisn't an integer)Proposed change
This proposal provides the initial infrastructure for clock management, as well as an implementation on the LPC55S69 and an initial driver conversion for the Flexcomm serial driver (mostly for demonstration purposes). Long term, the goal would be to transition all SOCs to this subsystem, and deprecate the clock control API. The subsystem has been designed so it can exist alongside clock control (much like pinmux and pinctrl) in order to make this transition smoother.
The application is expected to assign clock states within devicetree, so the driver should have no awareness of the contents of a clock state, only the target state name. Clock outputs are also assigned within the SOC devicetree, so drivers do not see the details of these either.
In order to fully abstract clock management details from consumers, the clock management layer is split into two layers:
This split is required because not all applications want the flash overhead of enabling runtime rate resolution, so clock states need to be opaque to the consumer. When a consumer requests a rate directly via
clock_mgmt_req_rate, the request will be satisfied by one of the predefined states for the clock, unless runtime rate resolution is enabled. Consumers can also apply states directly viaclock_mgmt_apply_state.Detailed RFC
Clock Management Layer
The clock management layer is the public API that consumers use to interact with clocks. Each consumer will have a set of clock states defined, along with an array of clock outputs. Consumers can query rates of their output clocks, and apply new clock states at any time.
The Clock Management API exposes the following functions:
clock_mgmt_get_rate: Reads a clock rate from a given clock output in Hzclock_mgmt_apply_state: Applies a new clock state from those defined in the consumer's devicetree nodeclock_mgmt_set_callback: Sets a callback to fire before any of the clock outputs defined for this consumer are reconfigured. A negative return value from this callback will prevent the clock from being reconfigured.clock_mgmt_disabled_unused: Disable any clock sources that are not actively in useclock_mgmt_req_rate: Request a frequency range from a clock outputA given clock consumer might define clocks states and outputs like so:
Note that the cells for each node within the
clocksproperty of a state are specific to that node's compatible. It is expected that this values will be used to configure settings like multiplexer selections or divider values directly.The consumer's driver would then interact with the clock management API like so:
Requesting Clock Rates versus Configuring the Clock Tree
Clock states can be defined using one of two methods: either clock rates can be requested from using
clock_mgmt_req_rate, or states can be applied directly usingclock_mgmt_apply_state. IfCONFIG_CLOCK_MGMT_SET_RATEis enabled, clock rate requests can also be handled at runtime, which may result in more accurate clocks for a given request. However, some clock configurations may only be possibly by directly applying a state usingclock_mgmt_apply_state.Directly Configuring Clock Tree
For flash optimization or advanced use cases, the devicetree can be used to configure clock nodes directly with driver-specific data. Each clock node in the tree defines a set of specifiers within its compatible, which can be used to configure node specific settings. Each node defines two macros to parse these specifiers, based on its compatible: Z_CLOCK_MGMT_xxx_DATA_DEFINE and Z_CLOCK_MGMT_xxx_DATA_GET (where xxx is the device compatible string as an uppercase token). The expansion of Z_CLOCK_MGMT_xxx_DATA_GET for a given node and set of specifiers will be passed to the clock_configure function as a void * when that clock state is applied. This allows the user to configure clock node specific settings directly (for example, the precision targeted by a given oscillator or the frequency generation method used by a PLL). It can also be used to reduce flash usage, as parameters like PLL multiplication and division factors can be set in the devicetree, rather than being calculated at runtime.
Defining a clock state that directly configures the clock tree might look like so:
This would setup the
mydev_muxandmydev_divusing hardware specific settings (given by their specifiers). In this case, these settings might be selected so that the clock output ofmydev_clock_sourcewould be 1MHz.Runtime Clock Requests
When
CONFIG_CLOCK_MGMT_RUNTIMEis enabled, clock requests issued viaclock_mgmt_req_ratewill be aggregated, so that each request from a consumer is combined into one set of clock constraints. This means that if a consumer makes a request, that request is "sticky", and the clock output will reject an attempt to reconfigure it to a range outside of the requested frequency. For clock states in devicetree, the same "sticky" behavior can be achieved by adding thelocking-stateproperty to the state definition. This should be done for states on critical clocks, such as the CPU core clock, that should not be reconfigured due to another consumer applying a clock state.Clock Driver Layer
The clock driver layer describes clock producers available on the system. Within an SOC clock tree, individual clock nodes (IE clock multiplexers, dividers, and PLLs) are considered separate producers, and should have separate devicetree definitions and drivers. Clock drivers can implement the following API functions:
notify: Called by parent clock to notify child it is about to reconfigure to a new clock rate. Child clock can return error if this rate is not supported, or simply calculate its new rate and forward the notification to its own childrenget_rate: Called by child clock to request frequency of this clock in Hzconfigure: Called directly by clock management subsystem to reconfigure the clock node. Clock node should notify children of its new rateround_rate: Called by a child clock to request best frequency in Hz a parent can produce, given a requested target frequencyset_rate: Called by a child clock to set parent to best frequency in Hz it can produce, given a requested target frequencyTo implement these APIs, the clock drivers are expected to make use of the clock driver API. This API has the following functions:
clock_get_rate: Read the rate of a given clockclock_round_rate: Get the best clock frequency a clock can produce given a requested target frequencyclock_set_rate: Set a clock to the best clock frequency it can produce given a requested target frequency. Also callsclock_lockon the clock to prevent future reconfiguration by clocks besides the one taking ownershipclock_notify_children: Notify all clock children that this clock is about to reconfigure to produce a new rate.clock_children_check_rate: Verify that children can accept a new rateclock_children_notify_pre_change: Notify children a clock is about to reconfigureclock_children_notify_post_change: Notify children a clock has reconfiguredAs an example, a vendor multiplexer driver might get its rate like so:
SOC devicetrees must define all clock outputs in devicetree. This approach is required because clock producers can reference each node directly in a clock state, in order to configure the clock tree without needing to request a clock frequency and have it applied at runtime.
An SOC clock tree therefore might look like the following:
Producers can provide specifiers when configuring a node, which will be used by the clock subsystem to determine how to configure the clock. For a clock node with the compatible
vnd,clock-compat, the followingmacros must be defined:
Z_CLOCK_MGMT_VND_CLOCK_COMPAT_DATA_DEFINE: Defines any static data that is needed to configure this clockZ_CLOCK_MGMT_VND_CLOCK_COMPAT_DATA_GET: Gets reference to previously defined static data to configure this clock. Cast to a void* and passed toclock_configure. For simple clock drivers, this may be the only definition needed.For example, the
vnd,clock-muxcompatible might have one specifier: "selector", and the following macros defined:The value that
Z_CLOCK_MGMT_VND_CLOCK_MUX_DATA_GETexpands to will be passed to theclock_configureAPI call for the driver implementing thevnd,clock-muxcompatible. Such an implementation might look like the following:A clock state to set the
mydev_clock_muxto use thepllclock as input would then look like this:Note the
mydev_clock_sourceleaf node in the clock tree above. These nodes must exist as children of any clock node that can be used by a peripheral, and the peripheral must reference themydev_clock_sourcenode in itsclock-outputsproperty. The clock management subsystem implements clock drivers for nodes with theclock-outputcompatible, which handles mapping the clock management APIs to internal clock driver APIs.Framework Configuration
Since the clock management framework would likely be included with every SOC build, several Kconfigs are defined to enable/disable features that will not be needed for every application, and increase flash usage when enabled. These Kconfig are the following:
CONFIG_CLOCK_MGMT_RUNTIME: Enables clocks to notify children of reconfiguration. Needed any time that peripherals will reconfigure clocks at runtime, or ifclock_mgmt_disable_unusedis used. Also makes requests from consumers toclock_mgmt_req_rateaggregate, so that if a customer makes a request that the clock accepts it is guaranteed the clock will not change frequency outside of those parameters.CONFIG_CLOCK_MGMT_SET_RATE: Enables clocks to calculate a new rate and apply it at runtime. When enabled,clock_mgmt_req_ratewill useruntime rate resolution if no statically defined clock states satisfy a request. Also enables
CONFIG_CLOCK_MGMT_RUNTIME.Dependencies
This is of course a large change. I'm opening the RFC early for review, but if we choose this route for clock management we will need to create a tracking issue and follow a transition process similar to how we did for pin control.
Beyond this, there are a few key dependencies I'd like to highlight:
Flash usage
NOTE: these numbers are subject to change! This is simply present to provide a benchmark of the rough flash impact of the clock framework with/without certain features
The below builds all configure the clock tree to output 96MHz using the internal oscillator to drive the
flexcomm0serial, and configure the LPC55S69's PLL1 to output a core clock at 144MHz (derived from the 16MHz external crystal)Concerns and Unresolved Questions
I'm unsure what the implications of requiring a 2 stage link process for all builds with the clock control framework that have runtime clocking enabled will be for build/testing time overhead.
In many ways, the concept of clock states duplicates operating points in Linux. I'm not sure if we want to instead define clock states as operating points. The benefit of this would be for peripherals (or CPU cores) that support multiple operating points and use a regulator to select them, since we could define the target voltage with the clock state.
Currently, we aggregate clock requests sent via
clock_mgmt_req_ratewithin the clock output driver, and the clock output driver will handle rejecting any attempt to configure a frequency outside of the constraints that have been set on it. While this results in simple application usage, I am not sure if it would instead be better to rely on consumers to reject rates they cannot handle. An approach like this would likely use less flash.Currently we also issue callbacks at three points:
We could potentially issue only one callback, directly before reconfiguring the clock. The question here is if this would satisfy all use cases, or are there consumers that need to take a certain action right before their clock reconfigures, and a different action after? One example I can think of is a CPU that needs to raise core voltage before its frequency rises, but would reduce core voltage after its core frequency drops
Alternatives
The primary alternative to this PR would be #70467. That PR implements uses functions to apply clock states, while this PR implements the SOC clock backend using a method much more similar to the common clock framework, but wraps the implementation in a clock management subsystem using states similar to how #70467 does. This allows us to work around the "runtime rate setting" issue, since this feature can now be optional