Project

General

Profile

[logo] 
 
Home
News
Activity
About/Contact
Major Tools
  Dinotrace
  Verilator
  Verilog-mode
  Verilog-Perl
Other Tools
  BugVise
  CovVise
  Force-Gate-Sim
  Gspice
  IPC::Locker
  Rsvn
  SVN::S4
  Voneline
  WFH
General Info
  Papers

Issue #1490

Add an option to create a DPI protected library

Added by Todd Strader 29 days ago. Updated about 7 hours ago.

Status:
Assigned
Priority:
Normal
Assignee:
Category:
Unsupported
% Done:

0%


Description

Jumping off from here: https://www.veripool.org/boards/3/topics/3037

I've got a WIP branch to show where I'm going: https://github.com/toddstrader/verilator-dev/tree/dpi-compile This adds a --dpi-protect option to Verilator which causes it to emit the wrapper SV and C++ files needed to build the library. It also modifies the emitted Makefile to build the library. Presently, the generated library demonstrates basic functionality when simulated with both Verilator and XSim.

Currently, I'm using my proof-of-concept codebase to test --dpi-protect: https://github.com/toddstrader/dpi-compile

General comments are welcomed, but I do have a couple specific questions:

To build the library, I need to run Verilator again on the wrapper SV file to get the DPI header. I'd like this to be as push-button as possible, so I'm planning to have the Verilator Perl wrapper call the binary again when run with --dpi-protect to build the DPI header. Alternatively, I could add a step to the emitted Makefile to do this, but I think the former is simpler. Thoughts? I also plan to add (via a separate issue) a mode to only emit the DPI header.

Since the Verilator runtime objects need to be linked against this library for it to do anything, we need to handle incompatibilities between different versions of Verilator. I don't believe that code was ever intended to have a stable API, so I think the safest option for now is to wrap each library's (one may have multiple) Verilator runtime in its own namespace. Thoughts? Is there a better way to handle this?

The branch needs a lot more testing and polishing, but I'd like to focus on getting a minimum viable feature in the trunk before adding a ton of additional features (e.g. symbol hashing).

secret.sv - --dpi-protect SV wrapper (6.35 KB) Todd Strader, 09/20/2019 11:58 AM

secret.cpp View - --dpi-protect C++ wrapper (2.71 KB) Todd Strader, 09/20/2019 11:58 AM

History

#1 Updated by Wilson Snyder 25 days ago

  • Category set to Unsupported
  • Status changed from New to Assigned

Will post comments about patch once get chance to review them.

To build the library, I need to run Verilator again on the wrapper SV file to get the DPI header. I'd like this to be as push-button as possible, so I'm planning to have the Verilator Perl wrapper call the binary again when run with --dpi-protect to build the DPI header. Alternatively, I could add a step to the emitted Makefile to do this, but I think the former is simpler. Thoughts? I also plan to add (via a separate issue) a mode to only emit the DPI header.

Why can't we build the dpi header in the same binary run? This seems cleaner for the user.

Since the Verilator runtime objects need to be linked against this library for it to do anything, we need to handle incompatibilities between different versions of Verilator. I don't believe that code was ever intended to have a stable API, so I think the safest option for now is to wrap each library's (one may have multiple) Verilator runtime in its own namespace. Thoughts? Is there a better way to handle this?

Some of the library runtines, e.g. tracing, multithreading and some introspection will only work if there is a single Verilator runtime instance. I suggest we punt the problem of different runtime versions as a future problem to clean up. One possibility would be to make there be a stable API for the shared routines, and make all of the other runtime routines (e.g. most VL_*) be run through a pre-archive linker pass so those routines are removed from being needed in the later final link stage.

#2 Updated by Todd Strader 25 days ago

Why can't we build the dpi header in the same binary run? This seems cleaner for the user.

It would definitely be preferable to get everything you need in one pass from the Verilator binary. Here are the options I see:
  1. Run the binary a second time in the Perl wrapper
  2. Run the binary a second time from the generated Makefile
  3. Add another emitter in V3EmitDpiProtect.cpp similar to EmitCSyms::emitDpiHdr()
  4. Create a new AST and run it through the visitor in V3EmitCSyms.cpp to create the DPI header (this one seems crazy)

I dismissed #3 because it seems weird to create code in Verilator to do something Verilator already does. But it wouldn't be hard. What do you think about #3? Or do you see another, better way to do this?

Some of the library runtines, e.g. tracing, multithreading and some introspection will only work if there is a single Verilator runtime instance. I suggest we punt the problem of different runtime versions as a future problem to clean up. One possibility would be to make there be a stable API for the shared routines, and make all of the other runtime routines (e.g. most VL_*) be run through a pre-archive linker pass so those routines are removed from being needed in the later final link stage.

Yeah, I don't see this as a problem that we need to solve for v0. But for my own edification, could you point out a specific thing that would't work if we just namespaced everything? I would imagine that all classes and static objects would then have the namespace as part of their symbols and not collide with other Verilators. And preprocessor stuff would be gone by the time we got to the .so.

I'm not sure what the performance implications of having multiple copies of the runtime would be (I'm sure it wouldn't make things better). However, the main advantage I see to this approach is that Verilator runtime API version doesn't have to be part of the build matrix that IP vendors would have to wrangle. We've already got OS (both kernel and precision), architecture and C++ ABI. It would be nice to limit the degrees of freedom in that matrix.

#3 Updated by Wilson Snyder 25 days ago

#3 is what I was thinking. Or, ideally refactor the DPI emit code to output something closer to pure Ast, and have V3Task and the new code create that Ast code.

If you namespace everything then any function in Verilated* that is a singleton will now no longer be a singleton but once per namespace. This will break stuff, e.g. tracing, threading, etc. Hence I was suggesting we separate stuff that is truly singleton-required from everything else.

#4 Updated by Todd Strader 22 days ago

Turns out that option #5 is to not need the DPI header at all. I'm not sure why I thought I needed it, but I don't.

Either I don't understand something fundamental about the Verilator singletons or I'm not explaining my namespace thoughts well. Either way, this thread doesn't get us closer to a MVP which is what I'd like to aim for. I'll be back to beat this dead horse later once I've done some more research.

I think the current branch is feature complete but currently missing testing (that's not in my external repo). I'd like to aim to land this once I have t_*pl test(s). Please let me know if you think we'll need anything else before pushing the feature.

#5 Updated by Todd Strader 22 days ago

To be clear, I think the MVP is feature complete (something self-contained that people could try out). There's a lot more to do for the larger feature, but I see those items being broken up into future commits.

#6 Updated by Wilson Snyder 22 days ago

In addition to a test_regress, please add an example/, then refer to it in the bin/verilator docs.

Bunch of TODOs to clean up. Would be good to use "FIXME" so these can't be forgotten.

Wondering why you build a shared object instead of a static library (.a)?

One of the things that bothers me with V3EmitC is the hacks used while printing code, versus having a transformation that basically makes an almost pure C++ Ast, then the emitter just prints the Ast. While this is a bit more complicated it helps code reuse and allows us to add transformations later. I think we should consider having your code make an Ast that EmitV and EmitC then can print out.

+=item --dpi-protect I<name>

Please describe what the <name> argument does.

Nitty stuff:

+  protected:
+    // TODO -- inouts?
+    typedef std::list&lt;AstVar*&gt; VarList;
+    VarList m_inputs;
+    VarList m_outputs;
+    int m_totalPorts;
+    AstNodeModule* m_modp;
+    string m_libName;

Please put members first (with // MEMBERS), then constructors (with // CONSTRUCTOS) then methods.

Need comment on every member. (Elsewhere too).

+  private:
+    void discoverPorts() {
+        AstNode* stmtp = m_modp->stmtsp();
+        while(stmtp) {
...
+            stmtp = stmtp->nextp();

Use a for loop? Space after "while".

+                    varp->v3fatalSrc("Unsupported direction for --dpi-protect: "<<
+                                     varp->direction().ascii());

<< goes on beginning of next line.

+class EmitVWrapper: public EmitWrapper {
+  public:
+    EmitVWrapper(AstNodeModule* modp):
+        EmitWrapper(modp),
+        m_of(v3Global.opt.makeDir()+"/"+m_libName+".sv")
+    {

{ goes on previous line.

+    void emitDpiParameters(VarList& ports) {
...
+            if (varp->direction() == VDirection::INPUT) {
+                if ((width = varp->width()) > 1) {
+                    m_of.puts("const svBitVecVal* ");
+                } else {
+                    m_of.puts("unsigned char ");

I suspect this isn't general enough, but the way to check is make sure your test tries every supported data type.

+        m_of.puts("void* create_dpi_prot_"+m_libName+" (const char* scope) {\n");

I think we should follow standardish C library naming conventions, where the library name is a prefix on every C function name.

+            of.puts("lib"+v3Global.opt.dpiProtect()+".so: $(VM_PREFIX)__ALL.a $(VK_GLOBAL_OBJS)\n");
+            of.puts("\t$(OBJCACHE) $(CXX) $(CXXFLAGS) $(CPPFLAGS) $(OPT_FAST) -shared -o $@ "+v3Global.opt.dpiProtect()+".cpp $^\n");

Split lines at the + to keep lines under ~100 chars.

#7 Updated by Todd Strader 18 days ago

Thanks for all the feedback.

Wondering why you build a shared object instead of a static library (.a)?

Because both the ModelSim and XSim documentation mentions loading shared objects. I assumed that they were only able to dlopen() these things, but it seems that XSim is fine either way. I tried to test ModelSim (Intel Starter Edition) but I cannot get their crusty old compiler to work on my system. I don't know about all flavors of ModelSim, but at least for this one we'll need a separate build to be able to handle it. So in and of itself, it's not a reason to go with .so over .a. Do you know if you can link DPI static archives with ModelSim? And do you know how this works on the other major simulators? Agreed that it's nicest just to compile with the archive and not have to worry about runtime loading.

I think we should consider having your code make an Ast that EmitV and EmitC then can print out.

I assume this would be a structure completely separate from the main AST. Is that what you're imagining here?

I'm planning to work on the tests first and then I'll come back to clean up these items.

#8 Updated by Wilson Snyder 18 days ago

I'm unaware of how Modelsim handles DPI.

Was thinking the structure would be part of the AST (as everything tries to be) but under some new top leaf under AstNetlist perhaps.

#9 Updated by Todd Strader 17 days ago

More for my own notes than anything else: running XSim with a static DPI library is not as simple as I had earlier indicated. The simple (non-Verilator) example I used to test .a vs .so works fine. But with the dpi-compile example, XSim gives me this terribly unhelpful error message:
ERROR: [Simtcl 6-50] Simulation engine failed to start: Simulation exited with status code 1.
Please see the Tcl Console or the Messages for details.
There are no other messages to be found. I tried to debug the simulation kernel, and while it's clearly not intended for me to invoke directly I do get this backtrace:
(gdb) bt                                                                                                                      
#0  0x0000555555556970 in DPISetMode@plt ()
#1  0x000055555556bea2 in __gnu_cxx::new_allocator<unsigned int*>::allocate(unsigned long, void const*) ()    
#2  0x000055555556b8b7 in std::allocator_traits<std::allocator<unsigned int*> >::allocate(std::allocator<unsigned int*>&, unsigned long) ()
#3  0x000055555556a5ec in std::_Deque_base<unsigned int, std::allocator<unsigned int> >::_M_allocate_map(unsigned long) ()
#4  0x0000555555568534 in std::_Deque_base<unsigned int, std::allocator<unsigned int> >::_M_initialize_map(unsigned long) ()
#5  0x0000555555566ec8 in std::_Deque_base<unsigned int, std::allocator<unsigned int> >::_Deque_base() ()
#6  0x0000555555565c80 in std::deque<unsigned int, std::allocator<unsigned int> >::deque() ()
#7  0x0000555555564619 in VerilatedImp::VerilatedImp() ()       
#8  0x0000555555563cf0 in __static_initialization_and_destruction_0(int, int) ()                             
#9  0x0000555555563d26 in _GLOBAL__sub_I_verilated.cpp ()
#10 0x000055555556c92d in __libc_csu_init ()
#11 0x00007ffff6ed9b28 in __libc_start_main (main=0x555555557e30 <main>, argc=1, argv=0x7fffffffe0b8, init=0x55555556c8e0 <__libc_csu_init>, fini=<optimized out>, rtld_fini=<optimized out>,
    stack_end=0x7fffffffe0a8) at ../csu/libc-start.c:266
#12 0x000055555555718a in _start ()    

Which is kind of weird because it looks like we're creating the static VerilatedImp object but then while it's setting up a member deque, DPISetMode@plt() is called. However, I don't see DPISetMode in the Verilator codebase.

All this is to say, I'm again unsure if static libraries will be generally usable by other simulators. It clearly works fine for Verilator itself, but other simulators will require further research.

#10 Updated by Todd Strader 1 day ago

More questions about the in-AST refactor:

Currently "--dpi-protect secret" creates foo.sv and foo.cpp. It also slightly modifies the emitted Makefile, but I'm going to assume we're not going to represent the Makefile in the AST, so let's ignore that for now.

I've attached the .cpp and .sv files that are currently generated by t_dpi_prot_secret for reference. To represent the .cpp I'll need to build up AstC* nodes that produce the same result. I thought at first I could build another top-level AstModule so that emitc() would just kick out another .cpp, but that produces a C++ class implementation which is not what I need here.

Next I tried to use EmitCImp::mainDoFunc() to kick out vanilla C functions without the C++ accoutrements. This is what I have here (for reference): https://github.com/toddstrader/verilator-dev/tree/dpi-compile-ast

This ultimately doesn't work either because the EmitCImp AstCFunc visitor assumes that it is dealing with a method for a module:
        puts(modClassName(m_modp)+"::"+nodep->name()
             +"("+cFuncArgs(nodep)+") {\n");

I believe the both the AST nodes and EmitCImp are not abstract enough to represent/emit generic C or C++. This of course makes sense because that's not what Verilator does. But I'm also not sure if trying to create --dpi-protect artifacts in the AST is going to work unless we greatly expand the way we model C/C++ in the AST. And I don't think that's worthwhile for just this one feature.

Wilson, what do you think? Am I misunderstanding something here? Given this do you think it's best to proceed without AST-ifying the --dpi-protect files? That would be my vote.

#11 Updated by Wilson Snyder about 7 hours ago

It also slightly modifies the emitted Makefile, but I'm going to assume we're not going to represent the Makefile in the AST, so let's ignore that for now.

Agreed, or as you say not for now...

thought at first I could build another top-level AstModule so that emitc() would just kick out another .cpp, but that produces a C++ class implementation which is not what I need here.

I appreciate that the current code that prints stuff isn't that complex, but I see this is likely to become a very important feature, and new features will creep in and make it unwieldly, so making something a bit more partitioned will help us. I certainly regret V3Emit doesn't work this way now (i.e. separate transformer and dumper).

More specifically maybe it would be under AstCFile, then maybe that m_modp code can also look at m_cfilep or something.

Perhaps a compromise is your code makes a AstCFile/AstVFile and underneath where there's no existing appropriate Ast, uses a lot of AstText (which just says "put out this code")? That's the hack used currently where I wanted something to go out, but didn't feel worthy of making a real Ast for later processing. Then we can abstract the AstText's away to new Ast types if/when we need to. Suppose that's also how I could start cleaning up V3EmicC.

Likewise for the .v it should be able to make an AstVFile and AST, and later there's a call to EmitV, rather than printing the code directly. It seems like your code has a great deal of duplication of the existing EmitV. (Including being called EmitVWrapper - but never seems to use EmitV.)

Also available in: Atom