Project

General

Profile

[logo] 
 
Home
News
Activity
About/Contact
Major Tools
  Dinotrace
  Verilator
  Verilog-mode
  Verilog-Perl
Other Tools
  BugVise
  CovVise
  Force-Gate-Sim
  Gspice
  IPC::Locker
  Rsvn
  SVN::S4
  Voneline
  WFH
General Info
  Papers

Issue #1444

Invalid xml output generated when code contains functions with string arguments

Added by Kanad Kanhere 4 months ago. Updated 3 months ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
WrongRuntimeResult
% Done:

0%


Description

For the attached code (topmod.sv), verilator (4.014) is generating invalid xml output (attached Vtopmod.xml).

The run command:

verilator_bin -sv topmod.sv --xml-only --Mdir verilator_output --top-module topmod

On line 52 of file Vtopmod.xml, the xml statement reads
<const fl="d20" name="\"invalid input during reset\"" dtype_id="2"/>
XML format needs '&quot;' be used for quotes and not '\"'

I took a shot at root-causing the issue and my investigation leads me to believe that the culprit is the putsQuoted() function in V3EmitXml.cpp which has a call

putsNoTracking(V3Number::quoteNameControls(str));
The V3Number::quoteNameControls function uses C++ style of escaping special characters, and hence doesn't work with XML.

topmod.sv (576 Bytes) Kanad Kanhere, 05/16/2019 03:39 PM

Vtopmod.xml View (2.92 KB) Kanad Kanhere, 05/16/2019 03:39 PM

0001-relocate-quoteNameControls-function-from-V3Number-to.patch View (6.05 KB) Kanad Kanhere, 05/16/2019 10:54 PM

0001-Fix-for-issue-1444.patch View (6.78 KB) Kanad Kanhere, 05/17/2019 06:54 PM

0001-Fix-for-issue-1444.patch View (8.6 KB) Kanad Kanhere, 05/29/2019 04:00 PM

History

#1 Updated by Wilson Snyder 4 months ago

  • Status changed from New to Confirmed

Your diagnosis sounds reasonable, perhaps you could provide a patch? Probably V3File's OutFormatter is the right spot to quote, or perhaps V3EmitXml.

#2 Updated by Kanad Kanhere 4 months ago

I can work on a fix. What I am not sure is whether
  • Add language argument to V3Number::quoteNameControls function
  • Create a new function in V3OutputFormatter - but this sounds like logic duplication

#3 Updated by Wilson Snyder 4 months ago

  • Category set to WrongRuntimeResult
  • Assignee set to Kanad Kanhere

Thanks for taking this on & asking.

V3Number::quoteNameControls really is in the wrong place, I would suggest first make a patch that moves it to be a static function in V3OutFormatter::quoteNameControls that takes a language argument. (It looks like it's called from places where there isn't a File object to taking the language as parameter is easiest vs. making it a member function and having to construct lots of temp classes.) All tests should pass as-is.

Then second patch would change this function for XML - please also update the e.g. test_regress/t/t_xml_tag.v test to exercise this case.

#4 Updated by Kanad Kanhere 4 months ago

First patch to relocate quoteNameControls function from V3Number to V3OutFormatter is attached.

Please let me know if it needs correction/modification.

Regards

#5 Updated by Wilson Snyder 4 months ago

0001 looks good, I just changed some spacing to bring up to newer standard (no tabs). Pushed to git in 4.015.

#6 Updated by Kanad Kanhere 4 months ago

Patch for the fix is attached. Includes test update.

#7 Updated by Wilson Snyder 4 months ago

Please add a \005 to the test and I suspect you'll see we need to also ampersand-number quote non-ascii similar to the C version.

We probably could use the &number style for special characters too, but people might find that odd. Either way.

#8 Updated by Kanad Kanhere 4 months ago

Do you mean U+0005 control character? My understanding is that it is not allowed in XML 1.0 [https://en.wikipedia.org/wiki/Valid_characters_in_XML#XML_1.0] Or do you mean non-printable characters should be spelt out in XML.

Basically can you give a transfer function from ascii-code to xml string and I can certainly update the function for that.

Regards

#9 Updated by Wilson Snyder 4 months ago

From https://www.liquid-technologies.com/xml/escapingdata.aspx

says you can character escape unicode format in decimal (better) or hex, so  (ampersand pound 5 semi)

#10 Updated by Kanad Kanhere 4 months ago

Apologies for the delay. I have attached the latest patch for the fix. The change now handles all ascii values (except for the null character). I have also updated the test to check this.

NOTE: the 3.2.1 lxml python library didn't like ascii values 1-8,11,12,14-31. It complained about invalid character (e.g. "invalid xmlChar 2, ...")

#11 Updated by Wilson Snyder 4 months ago

  • Status changed from Confirmed to Resolved

Great, thanks for the effort on this.

Pushed to git towards 4.016.

#12 Updated by Wilson Snyder 3 months ago

  • Status changed from Resolved to Closed

In 4.016.

Also available in: Atom