Cleaning up log sinks

Moving the conversation in Logging: Add log stream that only get sent warning+ logs. by Logikable · Pull Request #5311 · YosysHQ/yosys · GitHub to the Discourse.

Currently, all logs are funnelled to logv_string(), where they are fputs’d to a list of FILE *, and <<'d to a list of ostreams. This post proposes a third type of sink that aims to replace ostream sinks, and ideally the list of FILEs. (I’m aware of xkcd #927).

ostreams are too simple an interface to be a versatile sink. The recipient doesn’t have access to metadata like log level, log time, or the log source location.

There are at least two solutions:

  • Depend on a logger with sink support like spdlog.
  • Implement our own sink system. At its simplest, this would involve a LogMsg struct containing metadata and a LogSink class with a log(LogMsg) method that sinks can override.

I prefer the latter. Thoughts?

To me it makes a lot of sense to extend Yosys with more structured logging capabilities which would include both sinks for structured logging but also ways to emit more information in a structured way.

I do think that it makes sense to go with our own system. That would allow us to add support for the features that are useful to us and at the same time avoid paying for the complexity cost of features that make more sense for continuously running or even distributed systems but aren’t really a great match for our use cases.

For (as far as I know) hypothetical use cases where people would integrate Yosys into such systems, there’s always the option of writing a custom adaptor that is a sink for Yosys logging and a source for whatever logging infrastructure I’m sure such a system would already use.

I do think there are some things to keep in mind to allow a smooth transition. A big part of that is to maintain interoperability on both ends. On the sink side, that means it should be possible to access all structured logging metadata and to access all built in filtering logic from a default build without depending on any external tools or services. In particular,

  • there should be no regression in usefulness of the normal stdout/logfile logging (in fact, I think there are opportunities to improve this)
  • it should be possible to get human readable output that optionally includes the additional metadata on stdout / in logfiles, to avoid adding any friction to debugging logging related issues
  • it probably makes sense to also add machine readable output of this, that we can use to test the new logging infrastructure itself.

On the source side we need to think about how to translate the currently entirely unstructured logging into a more structured form. I think the requirement here should be that things shouldn’t regress even when we don’t make any changes to the log output of existing passes. E.g. I’m not sure how widespread that is, but I know that we have some code that emits what is logically a single log message using multiple log calls, sometimes even splitting the emission of a single line across multiple calls. I think by default the way we adapt pipe-of-characters sources to structured sinks could go with line buffering, that at least wouldn’t regress anything. This would be in addition to extending the logging API to give explicit control over what goes into a single message and what additional structured data to include.

On the structured metadata side, I think it also makes sense to include some Yosys specific metadata on top of always useful information like timestamps and source location. The primary thing I’m thinking of is the ability to associate a log message with specific design elements, and if those design elements have design source location data, also record that at the point of logging. In a best case scenario this is also done in a way that is easy to retrofit, e.g. by providing RAII logging contexts for at least module level granularity, since a lot of passes stay within a single module for most of what they do and the ability to filter logs by module after the fact would already be a significant usability improvement. You could also say, I want us to be able to provide more compiler diagnostic style output in addition or as an alternative to what we can currently do, and I think supporting structured metadata and more flexible sinks could be a great way to retrofit this.

On the filtering side there is some debug log functionality that can be dynamically enabled with the debug command, there are some separate toggles for specific debug logs and a few passes have their own verbose option. I’d love to get this cleaned up and have something that can be configured in a consistent way covering all those cases, but I don’t want to make the cleanup of that an initial requirement, but we should take it into consideration if we add any new filtering features to the logging infrastructure. There is also some builtin log message matching that we use throughout our test suite, this should remain backwards compatible as long as the source side remains untouched.

I don’t think we need to add all of this at once, in fact we should probably move towards this in smaller steps, but if we start adding new APIs for logging, I want to make sure that what we add will move us along a clear path towards improvements that make sense for the project as a whole, and that what we add becomes more or less immediately usable for all users from within yosys itself (unless it is specifically about providing additional external integration of features already available from within yosys).

I agree with all of this. The LogMessage class proposed in Logging: Add log sink system to replace ostreams by Logikable · Pull Request #5311 · YosysHQ/yosys · GitHub can act as the basis for more structured logging. It’ll be easy to add the Yosys metadata you proposed (module, associated design elements) to this either directly, or through a new Context class that is populated by Yosys modules and passed to the Logger. Then, a LogMessage is created whenever log() is called.

I think next steps include turning ostream and FILE sinks into LogSinks. The custom LogSink class introduced in the above PR is strictly more powerful than either ostreams or FILEs, so I don’t expect any regressions. This will greatly simplify the interface that the source needs to support, and all sinks will have access to the same structured metadata to filter on / emit.