Fixing `RTLIL::Const`

I filed an issue on this topic and then a PR but maybe it needs a Discourse thread, so here we are.

Working on OptMergePass performance, I noticed that OptMergeWorker::hash_cell_parameter() is slow. The problem is that even common cells like $and have five parameters (3 int, 2 boolean), each of those parameters is stored as a Const containing an std::vector of 32 States, and computing the hash of each of those Consts is a loop hashing the 32 States one at a time.

Taking a step back, Const internally stores either a std::vector<State> (one byte per RTLIL bit) or an std::string (one byte per 8 RTLIL bits), but most operations call bitvectorize() to revert to the std::vector<State> representation. Worse, a lot of code calls Const::bits() to get access to the internal std::vector<State> directly for reading and even writing.

My PR fixes this by deprecating Const::bits() and converting all in-tree code to use alternative Const methods, some of which I added to make this possible (e.g. a non-const iterator over bits). Then I converted various methods to support the std::string representation directly without bitvectorizing (including hashing), and made the Const(long long value, int width) constructor use the std::string representation when the width is a multiple of 8.

One of the commits speeds up string hashing by hashing 8 bytes at a time instead of 1. That may be of independent interest.

There are a lot of changes but none of them are very complicated. FWIW after writing the code and self-reviewing it, running Yosys tests only revealed a single bug, so the number of latent bugs remaining should be very low.

I could submit that entire PR for review, but if people would prefer me to break it into multiple PRs for review and merging, that’s fine too. I’d appreciate any guidance on this.

If after some deprecation period we’re able to remove Const::bits() altogether, that would unlock the potential for more optimizations (although they might not matter).

This is a change I’d welcome and fits into the larger pattern of reducing the amount of RTLIL implementation details that we expose as part of the public API to gain more flexibility needed for optimizing performance or memory usage or to be able to add new functionality that requires observing changes to RTLIL. (e.g., IIRC, before we added the std::string representation for Const the bits field itself was public and there are still many other places left where fields are exposed that, IMO, shouldn’t be).

I think we don’t need to be overlay cautious with introducing new APIs that avoid exposing internals in addition to what’s currently there, and I also don’t mind relatively aggressive deprecation of the old APIs.

I’d anticipate there to be more, similarly motivated changes to other parts of RTLIL, so for actual removal of deprecated APIs, I’d prefer if we can batch as many of those where the required downstream changes are mostly mechanical. It’s less work if you can take care of all of them in a single pass over your code. If we make a breaking change that requires non-trivial changes to downstream code, though, I’d prefer that to be the only breaking change in the corresponding release, since that becomes easier to do if you’re not distracted by unrelated things breaking simultaneously.

so for actual removal of deprecated APIs, I’d prefer if we can batch as many of those where the required downstream changes are mostly mechanical.

I think you mean that deprecation of APIs should happen simultaneously, at least approximately, but removal should be one at a time, right?

That makes a lot of sense. I’ll have a look at what else we might want to deprecate.

That wasn’t what I meant, but that also makes sense, just in a different context than I was thinking of.

I was taking a pessimistic perspective on maintenance. E.g. a project that has a yosys plugin that is only a small part of the overall project with the plugin being considered feature complete and most developers not necessarily familiar with yosys internals. I think there’s a good chance that such a project might be inclined to ignore deprecation warnings for some time and might only be prompted to refactor their yosys plugin to the new APIs when updating yosys starts breaking their build.

Even then, what I wrote isn’t correct, since I think a better model for such projects is that they will wait until things break, but when that happens they will fix everything that is deprecated at that time. So for this it doesn’t really matter whether removals are batched (nor whether deprecations are), but rather how often things break when following this “strategy”.

For projects that are more active in addressing uses of deprecated APIs, what you wrote makes more sense. Considering all that, my position is that it’s nice if we can approximately batch deprecations and that we should aim to space deprecations and removals to reduce the burden to more passively maintained yosys plugins.

I also consider all of this more of a loose guideline than something we should make a hard rule, e.g. if we’re blocked on removing some API to enable important optimizations or to support an important use case, we shouldn’t have to wait a year between deprecation and removal, but we should take a moment to consider whether there’s anything else we should deprecate while we’re at it.