Cleaning up `log_id()` and `RTLIL::unescape_id()`

log_id() is ugly because it returns a const char * and to avoid potential lifetime issues it stores the string in log_id_cache which leaks memory until the end of each pass. Also, it does exactly the same thing as RTLIL::id2cstr (which just wraps log_id()) and apart from the return type, that’s the same thing as RTLIL::unescape_id().

The name log_idisn’t ideal because it doesn’t actually log anything. The name id2cstr isn’t ideal because it does something different from IdString::c_str() (the former unescapes and the latter does not).

Here are some options for fixing this up:

  • Replace all uses of log_id()/RTLIL::id2cstr() with calls to RTLIL::unescape_id().
  • Replace all uses of all three of those with calls to a new (nonstatic) method IdString::unescape().
  • Remove all uses of all three in logging parameters and indicate that we want unescaping by adding something to the format string, e.g. log(“%#s”, id).

I prefer the second option, adding IdString::unescape(): easy to write id.unescape() at call sites and clear in its intent. We can make it return a std::string_view so it won’t make a copy of the string. There is the potential for lifetime issues if, for example, you call a function that returns a new IdString and then call unescape() on that — in a context that’s not a parameter to a logging function or other function call (which would extend the temporary lifetime). I haven’t found any examples of that with the existing functions (i.e. log_id_cache isn’t currently needed), so I don’t think that is going to be a significant problem in practice. By adding a new method, we can keep the existing functions in place for third-party code but deprecated and then remove them after some period of time.

BTW to be clear, we already support passing an IdString directly to log() etc, which prints the escaped ID. I’m assuming we want to keep that as-is.

1 Like

I thought the log("%#s", id) style had shown up in another discussion, but I can’t find it now to see if there was anything said about it. I think even if we did add the format string handling, there could still be some use case for IdString::unescape() in non-logging contexts, so that would also be my vote.

Note that log_id is overloaded for pointers, with log_id(foo) being equivalent to log_id(foo->name). I would prefer if we can avoid replacing every instance of that with foo->name.unescape(). I think a good way for that would be to have the default "%s" formatting for a RTLIL::NamedObject *foo (and compatible) be equivalent to foo->name.unescape(), which means we can just remove the log_id entirely for all of those and then use some_id.unescape() for IdStrings.

I realize that this would introduce a mismatch between unescaping when logging a raw IdString vs a NamedObject*, but I don’t think that’s going to be an issue in practice, since both escaped and unescaped formatting is still available and both options become simpler and less verbose (log_id(some_obj)some_obj and some_obj->name.c_str()some_obj->name).