Invalid xml output generated when code contains functions with string arguments
For the attached code (topmod.sv), verilator (4.014) is generating invalid xml output (attached Vtopmod.xml).
The run command:
verilator_bin -sv topmod.sv --xml-only --Mdir verilator_output --top-module topmod
On line 52 of file Vtopmod.xml, the XML statement reads
<const fl="d20" name="\"invalid input during reset\"" dtype_id="2"/>
XML format needs '&quot;' to be used for quotes and not '\"'.
I took a shot at root-causing the issue, and my investigation leads me to believe that the culprit is the putsQuoted() function in V3EmitXml.cpp, which has a call
putsNoTracking(V3Number::quoteNameControls(str));
The V3Number::quoteNameControls function uses C++ style escaping of special characters, and hence doesn't work with XML.
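To illustrate the mismatch, here is a minimal sketch contrasting the two escaping styles. The helper names escapeCStyle and escapeXmlAttr are hypothetical and are not Verilator functions; the XML variant simply uses the five predefined XML entities.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not Verilator code): C-style escaping of quotes
// and backslashes, similar in spirit to what the report describes.
std::string escapeCStyle(const std::string& in) {
    std::string out;
    for (char c : in) {
        if (c == '"' || c == '\\') out += '\\';  // prefix with a backslash
        out += c;
    }
    return out;
}

// Hypothetical helper (not Verilator code): escaping for an XML attribute
// value using the five predefined XML entities.
std::string escapeXmlAttr(const std::string& in) {
    std::string out;
    for (char c : in) {
        switch (c) {
        case '"': out += "&quot;"; break;
        case '\'': out += "&apos;"; break;
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;"; break;
        case '>': out += "&gt;"; break;
        default: out += c;
        }
    }
    return out;
}

// Small self-check on a quoted string like the one in the report.
inline void checkEscapes() {
    assert(escapeCStyle("\"x\"") == "\\\"x\\\"");       // yields \"x\"
    assert(escapeXmlAttr("\"x\"") == "&quot;x&quot;");  // yields &quot;x&quot;
}
```

The C-style result still contains a bare double quote, which terminates the attribute value early when it is placed inside name="...".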
#3 Updated by Wilson Snyder 4 months ago
- Category set to WrongRuntimeResult
- Assignee set to Kanad Kanhere
Thanks for taking this on & asking.
V3Number::quoteNameControls really is in the wrong place. I would suggest first making a patch that moves it to be a static function, V3OutFormatter::quoteNameControls, that takes a language argument. (It looks like it's called from places where there isn't a File object, so taking the language as a parameter is easier than making it a member function and having to construct lots of temp classes.) All tests should pass as-is.
Then a second patch would change this function for XML. Please also update a test, e.g. test_regress/t/t_xml_tag.v, to exercise this case.
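As a rough sketch of what a static function with a language argument could look like (this is an assumption for illustration, not the actual patch or Verilator's real signature; FormatterSketch and OutLang are hypothetical names):

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical language selector; the real enum in Verilator may differ.
enum class OutLang { C, XML };

struct FormatterSketch {
    // Static, so callers without a file/formatter object can still use it.
    static std::string quoteNameControls(const std::string& name, OutLang lang) {
        std::string out;
        for (unsigned char c : name) {
            const bool ctrl = (c < 0x20 || c == 0x7f);
            if (lang == OutLang::XML) {
                if (c == '"') out += "&quot;";
                else if (c == '\'') out += "&apos;";
                else if (c == '&') out += "&amp;";
                else if (c == '<') out += "&lt;";
                else if (c == '>') out += "&gt;";
                // Numeric character reference. Note that XML 1.0 parsers may
                // still reject references to most control characters.
                else if (ctrl) out += "&#" + std::to_string(static_cast<int>(c)) + ";";
                else out += static_cast<char>(c);
            } else {
                if (c == '\\' || c == '"') { out += '\\'; out += static_cast<char>(c); }
                else if (c == '\n') out += "\\n";
                else if (c == '\r') out += "\\r";
                else if (c == '\t') out += "\\t";
                else if (ctrl) {
                    char buf[8];
                    // Three-digit octal escape, e.g. \001
                    const int n = std::snprintf(buf, sizeof(buf), "\\%03o",
                                                static_cast<unsigned>(c));
                    assert(n > 0 && n < static_cast<int>(sizeof(buf)));
                    (void)n;
                    out += buf;
                } else out += static_cast<char>(c);
            }
        }
        return out;
    }
};
```

With this shape, existing callers would pass OutLang::C and keep their current output, while the XML emitter would pass OutLang::XML.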
#4 Updated by Kanad Kanhere 4 months ago
First patch to relocate quoteNameControls function from V3Number to V3OutFormatter is attached.
Please let me know if it needs correction/modification.
#8 Updated by Kanad Kanhere 4 months ago
Do you mean the U+0005 control character? My understanding is that it is not allowed in XML 1.0 [https://en.wikipedia.org/wiki/Valid_characters_in_XML#XML_1.0]. Or do you mean that non-printable characters should be spelled out in XML?
Basically, can you give a transfer function from ASCII code to XML string? I can certainly update the function for that.
#10 Updated by Kanad Kanhere 4 months ago
Apologies for the delay. I have attached the latest patch for the fix. The change now handles all ascii values (except for the null character). I have also updated the test to check this.
NOTE: the 3.2.1 lxml python library didn't like ascii values 1-8,11,12,14-31. It complained about invalid character (e.g. "invalid xmlChar 2, ...")
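The rejected values line up with the XML 1.0 character range, which within ASCII permits only tab (9), line feed (10), carriage return (13), and codes 32 and above. A minimal sketch of that check follows; the function names are hypothetical and this is not part of the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: true if an ASCII code (0..127) is a legal
// XML 1.0 character per the Char production of the XML 1.0 spec.
bool isXml10Ascii(int code) {
    assert(code >= 0 && code <= 127);  // ASCII range only
    return code == 9 || code == 10 || code == 13 || code >= 32;
}

// Collect every ASCII code that XML 1.0 does not allow.
std::vector<int> invalidAsciiCodes() {
    std::vector<int> bad;
    for (int code = 0; code <= 127; ++code) {
        if (!isXml10Ascii(code)) bad.push_back(code);
    }
    return bad;
}
```

The resulting list is 0, 1-8, 11, 12, and 14-31, i.e. the null character plus the values lxml complained about.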