For some of my work I’m loading very large designs in .rtlil format. One multi-GB file I have takes 76s in read_rtlil. I looked at some profiles and there’s a fair bit of headroom. For example there’s a strdup() for most tokens, everything is char* so we don’t know the lengths of strings, and generally there’s lots of copying. Part of the problem is just using Flex/Bison — they are pretty C-centric and generally difficult to work with. In particular for strings, constants and IDs we have to lex them in one pass and then parse+convert them in a separate pass, which is inherently slow.
So as an experiment I hacked together a new parser from scratch — handwritten recursive descent with integrated lexing and parsing.
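For anyone unfamiliar with the style, here is a minimal sketch of recursive descent with integrated lexing. It is not the actual read_rtlil_fast code. The `Parser`, `Module` and `Wire` types are hypothetical, and it only handles a tiny RTLIL-like subset (`module`, `wire` with an optional `width`, `end`). The point is that tokens are `std::string_view` slices of the input buffer, so their lengths are known and nothing is copied until a name is actually stored.

```cpp
#include <cctype>
#include <charconv>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical result types for this sketch; these are not Yosys's RTLIL classes.
struct Wire { std::string name; int width = 1; };
struct Module { std::string name; std::vector<Wire> wires; };

class Parser {
public:
    explicit Parser(std::string_view text) : rest_(text) {}

    // module := "module" ID { wire } "end"
    Module parse_module() {
        expect_keyword("module");
        Module m;
        m.name = std::string(parse_id());
        while (try_keyword("wire"))
            m.wires.push_back(parse_wire_body());
        expect_keyword("end");
        return m;
    }

private:
    std::string_view rest_;  // unparsed remainder of the input buffer

    void skip_space() {
        while (!rest_.empty() && std::isspace(static_cast<unsigned char>(rest_.front())))
            rest_.remove_prefix(1);
    }

    // Lexing happens on demand: return the next run of non-space characters
    // as a view into the buffer, without copying or consuming it.
    std::string_view peek_word() {
        skip_space();
        std::size_t n = 0;
        while (n < rest_.size() && !std::isspace(static_cast<unsigned char>(rest_[n])))
            ++n;
        return rest_.substr(0, n);
    }

    // Consume the keyword if it is the next word; otherwise leave the input alone.
    bool try_keyword(std::string_view kw) {
        if (peek_word() != kw) return false;
        rest_.remove_prefix(kw.size());
        return true;
    }

    void expect_keyword(std::string_view kw) {
        if (!try_keyword(kw))
            throw std::runtime_error("expected '" + std::string(kw) + "'");
    }

    // In this sketch an ID is any word starting with a backslash or '$'.
    std::string_view parse_id() {
        std::string_view w = peek_word();
        if (w.empty() || (w.front() != '\\' && w.front() != '$'))
            throw std::runtime_error("expected identifier");
        rest_.remove_prefix(w.size());
        return w;
    }

    // Lex and convert the integer in a single step, straight from the buffer.
    int parse_int() {
        std::string_view w = peek_word();
        int value = 0;
        auto [ptr, ec] = std::from_chars(w.data(), w.data() + w.size(), value);
        if (ec != std::errc() || ptr != w.data() + w.size())
            throw std::runtime_error("expected integer");
        rest_.remove_prefix(w.size());
        return value;
    }

    // wire := "wire" [ "width" INT ] ID   (the "wire" keyword is already consumed)
    Wire parse_wire_body() {
        Wire w;
        if (try_keyword("width"))
            w.width = parse_int();
        w.name = std::string(parse_id());
        return w;
    }
};
```

Each grammar rule is one ordinary function, so a debugger breakpoint or a stack trace maps directly onto the rule being parsed. Calling `Parser(text).parse_module()` on a buffer holding a `module \top ... end` block returns the populated `Module`.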
With a few performance tweaks, including a few (compatible) changes to RTLIL APIs, I got my multi-GB file down from 76s to 30s in read_rtlil_fast (about 2.5x faster). The profile is pretty clean now; over 50% is just hashing (and rehashing) strings for the IdString table. End-to-end read+dump of jpeg.synth.il goes from 84ms to 63ms (less of a speedup because it includes Yosys startup and dump).
The overall LoC is about the same as the current parser. It’s a little less readable because the grammar isn’t expressed as nicely, but I find this style of parser a lot easier to understand and debug, because it’s just code that any C++ programmer can follow, especially if you’re not already familiar with Flex/Bison.
I’m not sure how important RTLIL parsing performance is in practice and this change might be a bit scary, so I’m not sure how to proceed. I can clean this up and submit a PR that either keeps it as a separate read_rtlil_fast command or replaces read_rtlil. Otherwise I can just use it locally. Let me know!
(Note that it’s not really tested and probably has some bugs; obviously I’d address that.)