Why is my generated archive file so huge?

Added by David Banas 2 months ago

I have a case in which I'm able to Verilate <subblock> with little trouble (< 2 min., < 1 Mbyte). Then I try to Verilate its parent, which only instantiates 80 <subblock>s, and I end up with an archive file > 4 Gbytes!

I should mention that I'm attempting some trickery in the second case: instead of using the normal RTL suite for <subblock>, I'm making use of the previous Verilation of <subblock>, by writing a "shim", which stands in for the normal top-level SystemVerilog file for <subblock> and uses DPI calls to call functions in a DLL produced by the (hacked) Verilation of <subblock>.

(I'm trying with the normal RTL for <subblock> now, to see if my trickery is the cause of the size explosion of my archive file.)

Here're some excerpts from the V<parent>__stats.txt file. Does anyone see anything out of the ordinary? Are those two extreme jumps in the memory footprint typical?

Thanks! -db

Verilator Statistics Report

Information:
  Verilator 4.008 2018-12-01 rev UNKNOWN_REV
  Arguments: --cc --exe --threads 40 --no-decoration
-Wall -Wno-fatal -Wno-declfilename -Wno-pinconnectempty
-Wno-implicit -Wno-unused -Wno-caseincomplete -Wno-undriven
-Wno-width -Wno-BLKANDNBLK -Wno-MULTIDRIVEN
-Wno-PINMISSING -Wno-UNOPTFLAT -Wno-COMBDLY
--Mdir verilator/out -y verilator/src -y ...
-O3 -CFLAGS -O3 -g -fPIC -std=gnu++14
--output-split 20000 --output-split-cfuncs 50000
--stats --top-module <parent> <parent>_tb.cpp

Global Statistics:

  Warnings, Suppressed BLKANDNBLK                         40
  Warnings, Suppressed CASEINCOMPLETE                      6
  Warnings, Suppressed COMBDLY                          5306
  Warnings, Suppressed DECLFILENAME                        1
  Warnings, Suppressed IMPERFECTSCH                       31
  Warnings, Suppressed IMPLICIT                          109
  Warnings, Suppressed PINCONNECTEMPTY                     7
  Warnings, Suppressed PINMISSING                          1
  Warnings, Suppressed UNDRIVEN                           12
  Warnings, Suppressed UNOPTFLAT                           4
  Warnings, Suppressed UNUSED                            154
  Warnings, Suppressed WIDTH                              11

  Stage, Memory (MB), 006_link                     19.589844
  Stage, Memory (MB), 007_param                    303.316406

  Stage, Memory (MB), 020_unknown                  852.066406
  Stage, Memory (MB), 021_inline                   18758.488281

Stage Statistics:
  Stat                                         Link       PreOrder   Scoped     Final      Final_Fast
  --------                                     -------    -------    -------    -------    -------

  Branch prediction,                               173        484       3528       2303       2105
  Branch prediction, VL_UNLIKELY                                                    561

  Instruction count, TOTAL                       31868  237716833  238743616  522710063  272565131
  Instruction count, fast critical                   0       3920  128064709  273511569  264660460

Replies (25)

RE: Why is my generated archive file so huge? - Added by Wilson Snyder 2 months ago

It looks like it is inlining some modules and causing a huge expansion. It shouldn't be inlining if it expands that much; this might be due to your hacking or to some bug.

RE: Why is my generated archive file so huge? - Added by Wilson Snyder 2 months ago

There are several controls you can use to control inlining; see the manual. I'd probably first try adding /*verilator no_inline_module*/ to your hacked submodule.

RE: Why is my generated archive file so huge? - Added by David Banas 2 months ago

Thanks, Wilson!

Where, exactly, does the /*verilator no_inline_module*/ directive go?

RE: Why is my generated archive file so huge? - Added by Wilson Snyder 2 months ago

It goes wherever the IEEE grammar allows a non_port_module_item, that is, anywhere you could put a non-generate "wire" statement.

RE: Why is my generated archive file so huge? - Added by David Banas 2 months ago

Sorry, is it supposed to be attached to the module definition, or the module instantiation?

RE: Why is my generated archive file so huge? - Added by David Banas 2 months ago

I just noticed the /*verilator public_module*/ directive, which also appears to affect module inlining. Can you tell me how this directive relates to the /*verilator no_inline_module*/ directive?

RE: Why is my generated archive file so huge? - Added by Wilson Snyder 2 months ago

Put it in the definition of the module whose instantiations should not be inlined.

Don't use public; it dates from before there was a DPI and has other slowdown effects you won't want.

RE: Why is my generated archive file so huge? - Added by David Banas 2 months ago

Would you expect this placement of the directive to work?

module myMod #(
  {snip}
) (
  {snip}
);

/*verilator no_inline_module*/

{snip}

endmodule

RE: Why is my generated archive file so huge? - Added by Wilson Snyder 2 months ago

Is it not working? If you get a V*_myMod*.cpp file (and your module is non-trivial) then it's working. If your object is still huge, something else is going on; run with --debug and see where the .tree files get huge (use a scratch disk, they can be big).

RE: Why is my generated archive file so huge? - Added by David Banas 2 months ago

Thanks, Wilson!

Yes, since introducing the /*verilator no_inline_module*/ directive, I am getting a V<myBlock>_<subBlock>.cpp file that I was not getting before introducing that directive.

So, I guess the blow-up I'm seeing in the inlining phase has nothing to do with my <subBlock>.

I re-ran with the --debug option, as you suggested, and am seeing the same sort of blow-up in the size of the various *.tree files, starting with the inlining phase:

-rw-r--r--. 1 dbanas staff    2290308 Feb 12 08:20 /tmp/dbanas/vrltr_out/<myBlock>_002_cells.tree
-rw-r--r--. 1 dbanas staff    3434915 Feb 12 08:20 /tmp/dbanas/vrltr_out/<myBlock>_007_link.tree
-rw-r--r--. 1 dbanas staff         95 Feb 12 08:20 /tmp/dbanas/vrltr_out/<myBlock>_009_paramlink.tree
-rw-r--r--. 1 dbanas staff  148327631 Feb 12 08:21 /tmp/dbanas/vrltr_out/<myBlock>_011_width.tree
-rw-r--r--. 1 dbanas staff   80936104 Feb 12 08:21 /tmp/dbanas/vrltr_out/<myBlock>_013_const.tree
-rw-r--r--. 1 dbanas staff         97 Feb 12 08:21 /tmp/dbanas/vrltr_out/<myBlock>_014_assertpre.tree
-rw-r--r--. 1 dbanas staff         97 Feb 12 08:21 /tmp/dbanas/vrltr_out/<myBlock>_015_assert.tree
-rw-r--r--. 1 dbanas staff   29553181 Feb 12 08:21 /tmp/dbanas/vrltr_out/<myBlock>_016_const.tree
-rw-r--r--. 1 dbanas staff   32724472 Feb 12 08:22 /tmp/dbanas/vrltr_out/<myBlock>_019_begin.tree
-rw-r--r--. 1 dbanas staff   46060221 Feb 12 08:22 /tmp/dbanas/vrltr_out/<myBlock>_020_tristate.tree
-rw-r--r--. 1 dbanas staff   45970217 Feb 12 08:22 /tmp/dbanas/vrltr_out/<myBlock>_021_unknown.tree
-rw-r--r--. 1 dbanas staff 5681005125 Feb 12 08:30 /tmp/dbanas/vrltr_out/<myBlock>_022_inline.tree
-rw-r--r--. 1 dbanas staff 5440698297 Feb 12 08:41 /tmp/dbanas/vrltr_out/<myBlock>_024_const.tree
-rw-r--r--. 1 dbanas staff 5391825213 Feb 12 08:49 /tmp/dbanas/vrltr_out/<myBlock>_025_deadDtypes.tree
-rw-r--r--. 1 dbanas staff 5391985253 Feb 12 08:57 /tmp/dbanas/vrltr_out/<myBlock>_026_inst.tree
-rw-r--r--. 1 dbanas staff 5391981645 Feb 12 09:06 /tmp/dbanas/vrltr_out/<myBlock>_027_const.tree
-rw-r--r--. 1 dbanas staff 9230495789 Feb 12 09:20 /tmp/dbanas/vrltr_out/<myBlock>_028_scope.tree
-rw-r--r--. 1 dbanas staff 9099576621 Feb 12 09:35 /tmp/dbanas/vrltr_out/<myBlock>_029_linkdot.tree
-rw-r--r--. 1 dbanas staff         97 Feb 12 09:37 /tmp/dbanas/vrltr_out/<myBlock>_030_const.tree
-rw-r--r--. 1 dbanas staff 9099575868 Feb 12 09:51 /tmp/dbanas/vrltr_out/<myBlock>_031_deadDtypesScoped.tree
-rw-r--r--. 1 dbanas staff 9353181316 Feb 12 10:05 /tmp/dbanas/vrltr_out/<myBlock>_032_case.tree
-rw-r--r--. 1 dbanas staff 9350713949 Feb 12 10:19 /tmp/dbanas/vrltr_out/<myBlock>_034_task.tree
-rw-r--r--. 1 dbanas staff 9390785523 Feb 12 10:35 /tmp/dbanas/vrltr_out/<myBlock>_036_unroll.tree
-rw-r--r--. 1 dbanas staff         97 Feb 12 10:37 /tmp/dbanas/vrltr_out/<myBlock>_037_slice.tree
-rw-r--r--. 1 dbanas staff 8928805993 Feb 12 10:51 /tmp/dbanas/vrltr_out/<myBlock>_038_const.tree
-rw-r--r--. 1 dbanas staff 1810864268 Feb 12 10:55 /tmp/dbanas/vrltr_out/<myBlock>_039_life.tree
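A listing like the one above can also be scanned mechanically for the stage where the dump size jumps. The following is only a rough sketch: the sizes are a subset copied from the listing, while the `biggest_jump` helper and its 1 KB cutoff (used to skip the 95- and 97-byte files, which look like stubs) are assumptions made for illustration.

```python
# (stage, size in bytes) pairs copied from the listing; only a subset is shown.
dumps = [
    ("007_link", 3434915),
    ("009_paramlink", 95),
    ("011_width", 148327631),
    ("016_const", 29553181),
    ("021_unknown", 45970217),
    ("022_inline", 5681005125),
    ("028_scope", 9230495789),
]


def biggest_jump(dumps, min_size=1024):
    """Return (stage, growth ratio) for the largest size increase between
    consecutive dumps, skipping files smaller than min_size bytes."""
    real = [(name, size) for name, size in dumps if size >= min_size]
    best = None
    for (_, prev), (name, size) in zip(real, real[1:]):
        ratio = size / prev
        if best is None or ratio > best[1]:
            best = (name, ratio)
    return best


stage, ratio = biggest_jump(dumps)
print(f"largest jump at {stage}: about {ratio:.0f}x the previous dump")
```

On this subset the largest growth lands on the inline stage, which matches what the raw listing suggests.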

When I look at the <myBlock>_022_inline.tree file, I see that the inlining begins with the very top-level module:

Verilator Tree Dump (format 0x3900) from <e19646048> to <e42575550>
     NETLIST 0x2189cf0 <e1> {a0}
    1: MODULE 0x75a1e0f0 <e15228018> {j7}  TOP_<myBlock>  L1 [P]
    1:1: CELLINLINE 0x474f07480 <e20795021#> {j7}  <myBlock> -> <myBlock>
    1:1: CELLINLINE 0x2a5440c40 <e20795023#> {j997}  <myBlock>__DOT__rtl -> __BEGIN__
    1:1: CELLINLINE 0x2a5440d40 <e20795025#> {j1535}  <myBlock>__DOT__rtl__DOT__feat -> __BEGIN__
    1:1: CELLINLINE 0x2a5440e40 <e20795027#> {j1536}  <myBlock>__DOT__rtl__DOT__feat__BRA__0__KET__ -> __BEGIN__
    1:1: CELLINLINE 0x2a5440f40 <e20795029#> {j1536}  <myBlock>__DOT__rtl__DOT__feat__BRA__0__KET____DOT__chan -> __BEGIN__

Do you think that flagging my top-level module with the /*verilator no_inline_module*/ directive might help?

RE: Why is my generated archive file so huge? - Added by David Banas 2 months ago

Hi Wilson,

I just spotted something suspicious and would like your opinion. I find this code in my top level module definition:

for(genvar i=0; i<M; i++) begin: r
  for(genvar j=0; j<N; j++) begin: c
    dReg x_reg (
      .clk (clk_out),
      .d   (i),
      .q   (o)
    );
  end
end

And I note a 1:1: CELLINLINE ... line in my ..._inline.tree file for every iteration of the nested generate loops in the code above.

The clk_out signal comes from a gating clock buffer.

I'm wondering: might this generated clock feeding into a nested generate loop be causing my blow-up?

Thanks!
-db

RE: Why is my generated archive file so huge? - Added by Wilson Snyder 2 months ago

Unless dReg is huge it's likely fine. Look at the generated .tree files and see what lines are getting generated most often. For example, write a little program to take the inline.tree file and keep only the letters and numbers in braces "{[a-z]+[0-9]+}" on each line then pipe through a "sort|uniq -c" to see which have highest count. Then the number in the braces is the line number. To decode the letter in the braces to a filename, run with --xml and see the "file id=" for the given letter.
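One possible shape for the "little program" described above, as a sketch rather than anything official: it pulls the {letters+digits} tag out of each line and counts how often each one occurs. The `count_tags` name and the sample lines are made up for illustration; the tag format is taken from the tree excerpts in this thread.

```python
import re
from collections import Counter

# Matches a tag such as {j7} or {cy26}: letters = file id, digits = line number.
TAG = re.compile(r"\{([a-z]+)([0-9]+)\}")


def count_tags(lines):
    """Count how often each (file id, line number) tag appears in tree-dump lines."""
    counts = Counter()
    for line in lines:
        m = TAG.search(line)
        if m:
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts


# Hypothetical sample lines shaped like the dump excerpts in this thread.
sample = [
    "    1: MODULE 0x75a1e0f0 <e15228018> {j7}  TOP_x  L1 [P]",
    "    1:1: CELLINLINE 0x474f07480 <e20795021#> {j7}  x -> x",
    "    1:1: CELLINLINE 0x2a5440c40 <e20795023#> {j997}  x__DOT__rtl -> __BEGIN__",
    "     NETLIST 0x2189cf0 <e1> {a0}",
]
counts = count_tags(sample)
for (file_id, line_no), n in counts.most_common(3):
    print(n, file_id, line_no)
```

For a real dump, passing an open file object (for example `count_tags(open(path))`) streams it line by line, which matters when the .tree file is several gigabytes.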

RE: Why is my generated archive file so huge? - Added by David Banas 2 months ago

Did you mean: --xml-only?

RE: Why is my generated archive file so huge? - Added by David Banas 2 months ago

Hi Wilson,

I tried your advice (I think) and it seems to have had the opposite of the intended effect:

in-lining?   # of C++ files generated   Total lines of C++ code
yes          265                        2,271,569
no           345                        21,128,337

Whoa! what happened? I thought the whole point of introducing those /* verilator no_inline_module */ directives was to reduce the total number of C++ lines generated, via reuse of those (previously in-lined) modules.

Any thoughts?

Thanks!
-db

For the record, here's what I did.

First, I followed your advice:

$ sed -n 's/.*\({.*}\).*/\1/p' <block>_022_inline.tree | sort | uniq -c | sort -g
 {snip; much repetition of two-letter codes below.}
 632307 {bc43}
 650244 {cq10}
 944540 {co10}
2634246 {cy26}

Then, using the <block>.xml file from my --xml-only run, I correlated the 4 two-letter labels above to filenames:

    <file id="bc" filename="<file1>" language="1800-2017"/>
    <file id="co" filename="<file2>" language="1800-2017"/>
    <file id="cq" filename="<file3>" language="1800-2017"/>
    <file id="cy" filename="<file4>" language="1800-2017"/>
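The correlation step can be scripted as well. This is a minimal sketch that assumes the `<file>` entries look like the ones quoted above (`id` first, then `filename`); the filenames in the sample are hypothetical, since the real ones are elided in this thread, and `file_table` and `decode` are illustrative names.

```python
import re

# Matches <file id="..." filename="..." ...> entries; the attribute order is
# assumed to be the one shown in the excerpt.
FILE_ENTRY = re.compile(r'<file\s+id="([^"]+)"\s+filename="([^"]*)"')


def file_table(xml_text):
    """Return {file id: filename} for every <file> entry found in xml_text."""
    return dict(FILE_ENTRY.findall(xml_text))


def decode(tag, table):
    """Split a tag such as 'cy26' into (filename, line number)."""
    m = re.fullmatch(r"([a-z]+)([0-9]+)", tag)
    return table[m.group(1)], int(m.group(2))


# Hypothetical filenames, used only to exercise the helpers.
xml_text = '''
<file id="bc" filename="a.v" language="1800-2017"/>
<file id="cy" filename="mux.v" language="1800-2017"/>
'''
table = file_table(xml_text)
print(decode("cy26", table))
```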

And, in each of those 4 files, I added the /* verilator no_inline_module */ directive, like so:

module <mod1> (
  {snip I/O defs.}
  /* verilator no_inline_module */

I then launched a "normal" run (i.e., without --debug and without --xml-only).

Well, it's the next morning and I see the penalty one pays for adding all these /* verilator no_inline_module */ directives. It's still running!

The last thing I did before launching the run was edit the makefile:

$ ll ../makefile
-rw-r--r-- 1 dbanas 8.8K Feb 12 20:33 ../makefile

And the compilation started at 02:32 this morning.

That means the translation took 02:32 (next day) - 20:33 = about 6 hours! Or, 24x longer than the 15 min. that it had been taking, before I added those /* verilator no_inline_module */ directives.
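The arithmetic can be double-checked in a few lines. The year and the next-day date are assumptions (the makefile listing only shows "Feb 12 20:33"), and the makefile edit time is used as a stand-in for the start of the run.

```python
from datetime import datetime

# Assumed timestamps: makefile edit on Feb 12, compilation start the next morning.
started = datetime(2019, 2, 12, 20, 33)    # makefile last edited
compiling = datetime(2019, 2, 13, 2, 32)   # C++ compilation began

elapsed_min = (compiling - started).total_seconds() / 60
slowdown = elapsed_min / 15   # earlier translations took about 15 minutes

print(f"{elapsed_min:.0f} min elapsed, roughly {slowdown:.0f}x slower")
```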

RE: Why is my generated archive file so huge? - Added by David Banas 2 months ago

Hi Wilson,

I was just confirming that these 4 modules, for which I've disallowed in-lining, are getting their own C++ files; they are. However, I noticed something odd: while 3 of them are getting the usual 2 C++ files (one fast, one slow), the fourth is getting 60 C++ files! I find this really strange, because the Verilog source code for this module is really simple:

// dbanas-2019_02_12: Overridden, to supply 'no_inline_module' pragma.

module udp_dff (out, in, clk, clr_, set_, NOTIFIER);
   output out; reg out;
   input  in, clk, clr_, set_, NOTIFIER;

   /* verilator no_inline_module */

    always @(posedge clk, negedge clr_, negedge set_)
    begin
      if ( ~set_ )
        out <= 1'b1 ;
      else if ( ~clr_ )
        out <= 1'b0 ;
      else
        out <= in ;
    end
endmodule

When I look at the first and last of these 60 C++ files, it appears to me that each instantiation of this very simple flop is being given its own class method. Is that how this is supposed to work? I would've thought that the intent here was the opposite: reuse of a single class method.

Can you help me grok this, please?

Thanks!
-db

RE: Why is my generated archive file so huge? - Added by Wilson Snyder 2 months ago

You have modules which are parameterized so they get expanded. If inlined they then get compressed away.

Did you try the experiment I suggested to find the expanded line numbers?

RE: Why is my generated archive file so huge? - Added by David Banas 2 months ago

Sorry, I'm confused: I don't see any parameters in the udp_dff module above.

Yes, I did try your proposed experiment.
My full write-up is contained in my second-to-last comment above.

Thanks,
-db

RE: Why is my generated archive file so huge? - Added by Wilson Snyder 2 months ago

So what is the code corresponding to file4 line 26?

RE: Why is my generated archive file so huge? - Added by David Banas 2 months ago

Line 26 in the original source file is the module inputs declaration statement.

`define ARM_PROP_DELAY 0.0
`define ARM_PERIOD 0.1
`define ARM_WIDTH 0.03
`define ARM_SETUP_TIME 0.01
`define ARM_HOLD_TIME 0.01
`define ARM_RECOVERY_TIME 0.01
`define ARM_REMOVAL_TIME 0.01
`ifdef IBM_VLOG_10ps
`timescale 1 ns / 10 ps
`else
`timescale 1 ns / 1 ps
`endif
`celldefine
module MUX4_X1N_A9PP84TL_C16 (Y, A, B, C, D, S0, S1);
output Y;
input A, B, C, D, S0, S1;
  MUX41_UDP u0(Y, S0, S1, A, C, B, D);
specify
if (B==1'b1 && C==1'b1 && D==1'b1 && S0==1'b0 && S1==1'b0)
(A => Y) = (`ARM_PROP_DELAY,`ARM_PROP_DELAY);
{snip many more "if ..." statements of the same form}
endspecify
endmodule // MUX4_X1N_A9PP84TL_C16
`endcelldefine

RE: Why is my generated archive file so huge? - Added by Wilson Snyder 2 months ago

And you have 2.6 million of those muxes? I can see why turning off the inliner will make that very slow. What's the code in MUX41_UDP?

RE: Why is my generated archive file so huge? - Added by David Banas 2 months ago

// MUX41_UDP.v - Behavioral alternative to original 4:1 MUX, written using UDP tables.

module MUX41_UDP (MUXOUT, SEL0, SEL1, DATA0, DATA1, DATA2, DATA3);
  output MUXOUT;
  input SEL0;
  input SEL1;
  input DATA0;
  input DATA1;
  input DATA2;
  input DATA3;

  /* verilator no_inline_module */

  always @* begin
    case (SEL0)
      1'b0: begin
        case (SEL1)
          1'b0:    MUXOUT = DATA0;
          default: MUXOUT = DATA1;
        endcase // case (SEL1)
      end

      default: begin
        case (SEL1)
          1'b0:    MUXOUT = DATA2;
          default: MUXOUT = DATA3;
        endcase // case (SEL1)
      end
    endcase // case (SEL0)
  end
endmodule

RE: Why is my generated archive file so huge? - Added by Wilson Snyder 2 months ago

You certainly don't want to no-inline-module that or any small primitive modules.

Use this code instead, might be much smaller.

wire MUXOUT = SEL0 ? (SEL1 ? DATA3 : DATA2) : (SEL1 ? DATA1 : DATA0);
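When collapsing the nested case statement into a single expression, the order of the data inputs is easy to get backwards, so a brute-force check over all 64 input combinations may be worth the minute it takes. The sketch below models only 0/1 values (no X or Z), and the function names are illustrative.

```python
from itertools import product


def mux_case(sel0, sel1, d0, d1, d2, d3):
    """Model of the nested case statement in MUX41_UDP."""
    if sel0 == 0:
        return d0 if sel1 == 0 else d1
    return d2 if sel1 == 0 else d3


def mux_ternary(sel0, sel1, d0, d1, d2, d3):
    """Nested ternary with the same selection order as the case statement:
    SEL0 ? (SEL1 ? DATA3 : DATA2) : (SEL1 ? DATA1 : DATA0)."""
    return (d3 if sel1 else d2) if sel0 else (d1 if sel1 else d0)


def mux_reversed(sel0, sel1, d0, d1, d2, d3):
    """Same shape, but with the data inputs listed in the opposite order:
    SEL0 ? (SEL1 ? DATA0 : DATA1) : (SEL1 ? DATA2 : DATA3)."""
    return (d0 if sel1 else d1) if sel0 else (d2 if sel1 else d3)


def equivalent(f, g):
    """True if f and g agree on all 64 combinations of 0/1 inputs."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=6))


matches = equivalent(mux_case, mux_ternary)
mismatch = equivalent(mux_case, mux_reversed)
print(matches, mismatch)
```

Only the first ternary agrees with the case-based module on every input; the reversed ordering selects DATA3 where the original selects DATA0.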

RE: Why is my generated archive file so huge? - Added by David Banas 2 months ago

Hi Wilson,

Thanks so much for all the time you've given this!

I'm trying to wrap up my report on my experiment with Verilator, and am realizing that there's one major loose end still unexplained. I've quoted it below. (It's about midway up in this thread.) And I'm wondering: do you have any hunches about why this one simple primitive module (of four) is blowing up into 60 C++ files, while the other 3 are resulting in the more usual two files (one fast, one slow)?

Thanks!
-db

David Banas wrote:

Hi Wilson,

I was just confirming that these 4 modules, for which I've disallowed in-lining, are getting their own C++ files; they are. However, I noticed something odd: while 3 of them are getting the usual 2 C++ files (one fast, one slow), the fourth is getting 60 C++ files! I find this really strange, because the Verilog source code for this module is really simple:

[...]

When I look at the first and last of these 60 C++ files, it appears to me that each instantiation of this very simple flop is being given its own class method. Is that how this is supposed to work? I would've thought that the intent here was the opposite: reuse of a single class method.

Can you help me grok this, please?

Thanks!
-db

RE: Why is my generated archive file so huge? - Added by Wilson Snyder 2 months ago

Each module usage needs code to implement each mux, so I suspect that is what is going on. Again, this should be inlined; maybe you need to force inlining, e.g. with /*verilator inline_module*/.
