Add --protect-ids to enhance --protect-lib obfuscation #1521

Closed
veripoolbot opened this issue Sep 24, 2019 · 9 comments
Labels: resolution: fixed (Closed; fixed), type: feature-IEEE (Request to add new feature, described in IEEE 1800)

Comments

@veripoolbot
Contributor


Author Name: Todd Strader (@toddstrader)
Original Redmine Issue: 1521 from https://www.veripool.org

Original Assignee: Todd Strader (@toddstrader)


Signal names and some other design details are visible via library symbols and possibly strings. We can have Verilator obfuscate these names and/or strip symbols while building the library.

@veripoolbot
Contributor Author


Original Redmine Comment
Author Name: Wilson Snyder (@wsnyder)
Original Date: 2019-09-24T14:42:20Z


If anyone starts on this let me know, as I at some point might also take a look at it.

@veripoolbot
Contributor Author


Original Redmine Comment
Author Name: Wilson Snyder (@wsnyder)
Original Date: 2019-09-26T02:44:53Z


Proposed new flags to implement this feature for discussion:

--protect-symbols = When specified, Verilator will hash any private symbol names (variable, module, and assertion block names that are not on the top level or exposed with --public) into random-looking names, so that the compiled protected library binaries expose less design information. Verilator will also create a {prefix}__symmap.xml file which contains the mapping from the hashed names back to the original names. The symmap file is to be kept private; it helps map any runtime design or waveform information (which will show the hashed names) back to the original design's symbol names for debug. This hashing uses the provided or default --protect-key; see important details there.

--protect-key = For --protect-symbols, the private hash key. This key is used for symbol hashing, such that the same key will produce the same hashed symbol names. For best security this key must be at least 16 bytes; the output of uuidgen is ideal. Typically, a user would create a key once for a given protected design element, then pass the same --protect-key to each Verilator run for that design. Thus, if the input Verilog is similar between Verilator runs, the Verilated code will likewise be mostly similar. If --protect-key is not specified and --protect-symbols is used, Verilator will generate a new key for every Verilator run and not save it. That is best for security, but means every Verilator run will give different output even for identical input.
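To make the proposal concrete, here is a minimal sketch of keyed symbol hashing in Python. It is not Verilator's implementation: the use of HMAC-SHA256, the "PS" prefix, the 16-character truncation, and the function names `protect_name` and `build_symmap` are all assumptions made for illustration. It only shows the two properties described above: the same key gives the same hashed names, and a missing key is replaced by a fresh one that is never saved.

```python
import hashlib
import hmac
import secrets


def protect_name(name: str, key: bytes) -> str:
    """Return a deterministic, random-looking identifier for `name`."""
    digest = hmac.new(key, name.encode("utf-8"), hashlib.sha256).hexdigest()
    # Prefix with letters so the result is always a valid C identifier.
    return "PS" + digest[:16]


def build_symmap(names, key=None):
    """Map hashed names back to the originals (the private symmap)."""
    if key is None:
        # No key given: use a fresh random key for this run and do not save it.
        key = secrets.token_bytes(16)
    return {protect_name(n, key): n for n in names}
```

With a fixed key, `build_symmap` returns the same mapping on every run; with `key=None` the hashed names differ from run to run even for identical input.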

@veripoolbot
Contributor Author


Original Redmine Comment
Author Name: Todd Strader (@toddstrader)
Original Date: 2019-09-26T04:09:06Z


Do you see uses for this beyond --dpi-protect? I'd imagine, but of course can't say for sure, that IP vendors wouldn't want end users to be able to generate waveforms of their logic regardless of whether the signals are garbled or not. I would instead think they'd want something like #�. Also, I know we're not on the same page wrt bug 1518 yet, but if we end up wrapping each --dpi-protected module's Verilator runtime in a unique namespace, they wouldn't even be able to dump traces to the same file. And regardless, in the case where someone is running a --dpi-protect'ed module under a non-Verilator simulator, tracing the protected module would be . . . complicated. I guess we could have environment variables tell us where to put the waveform file, but that feels dirty to me.

Also, I don't know enough about the strip utility, but I'm wondering if there are ways outside of Verilator (well, inside the verilated Makefile) to solve the problem of leaking symbols to the end user when under --dpi-protect.

In any case, your proposed options look reasonable in and of themselves so I have no objections to them. I just suggest that we make --dpi-protect work as simply as possible by itself, which in my mind would mean no symbols or meaningless symbols by default. And if there are reasonable ways to do that in the verilated Makefile while building the library I'd suggest that we do that as well.

@veripoolbot
Contributor Author


Original Redmine Comment
Author Name: Wilson Snyder (@wsnyder)
Original Date: 2019-09-26T11:55:54Z


Ease of use is a good point, so I agree --dpi-protect (or whatever it is renamed to) should set --protect-symbols by default; if someone doesn't want this they can use --dpi-protect -no-protect-symbols. There are two uses for that. The first is debugging a model to get it working (and never shipping it). Second, some IP customers (my company included) generally require IP providers not to restrict debug, as that harms the schedule. In that case, a model where tracing works but signals are exposed would be contractually required/interesting.

Also, some users are using Verilator for embedding in programs they send outside and have requested an option to do this before, and/or are obfuscating before Verilator, so this option has standalone value.

I agree the dpi-protect option should strip by default. I believe a normal "strip" removes information required to link, so you can't do that. There should be some set of strip flags that remove only the non-required stuff; you might need to experiment a bit.

@veripoolbot
Contributor Author


Original Redmine Comment
Author Name: Wilson Snyder (@wsnyder)
Original Date: 2019-10-01T10:51:36Z


Got an experiment in place and realized changing V3Name to make this work
doesn't really work, as it currently adds __PVT because later reporting
just strips that. Replacing with encoded names means all later error
messages use the encoded name, which is almost unusable. Furthermore, some names
like foo__DOT__bar become asdbasd_DOT_asdasdjh, and it pains me to expose
the hierarchy with the dot.

I see these routes:

  1. Keep the "string name"'s as is. Encode them in V3Name. When printing
    messages reverse the encoding back to the original. Upside is it protects
    .tree outputs also. Downside will still have names like
    asdbasd_DOT_asdasdjh. Seems hackish.

  2. Convert "string name" to a class (VName) where we track the original
    name and C symbol separately. There's already some classes with origName so
    those would get cleaned up. Might be a lot of work to clean up all
    name()'s to operate appropriately. Code would always need to be properly
    written to know which name is appropriate. Perhaps maintenance pain.

  3. Add a new replacement just before V3Emit. Things like your AstText's
    would need symbols split out (e.g. into AstTextHashable). Ideally V3EmitC
    would change to also make AstTextHashable's so then we just hash all
    AstTextHashables. Not sure how to handle V3EmitMk - perhaps it ideally
    also needs to make these AstTextHashables.

  4. Emitters need to, at the appropriate point when printing a name(), instead
    call into the hash generator, e.g. ->nameHashed(). This seems the cleanest
    route and is what I propose to try coding next.

Any thoughts?
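As a rough illustration of route 4, the sketch below keeps the original name on the node and hashes it only at emit time, so diagnostics stay readable and the hierarchy separator never reaches the output. It is written in Python for brevity rather than in Verilator's C++, and the `Node` class, `name_hashed`, `emit_decl`, and `error_message` are hypothetical names, not Verilator's actual classes or methods.

```python
import hashlib
import hmac


class Node:
    """Hypothetical stand-in for an AST node that carries a symbol name."""

    def __init__(self, name: str, public: bool = False):
        self.name = name      # original name, used for diagnostics
        self.public = public  # public/top-level symbols are left readable

    def name_hashed(self, key: bytes) -> str:
        if self.public:
            return self.name
        # Hash the whole hierarchical name so "__DOT__" never reaches the output.
        digest = hmac.new(key, self.name.encode("utf-8"), hashlib.sha256)
        return "PS" + digest.hexdigest()[:16]


def emit_decl(node: Node, key: bytes) -> str:
    """Emitter side: the only place where the hashed spelling is used."""
    return f"CData {node.name_hashed(key)};"


def error_message(node: Node, text: str) -> str:
    """Diagnostics keep the original name, so messages stay usable."""
    return f"%Error: {node.name}: {text}"
```

The design point is that hashing happens as late as possible: everything before the emitter, including error reporting, sees only original names.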

@veripoolbot
Contributor Author


Original Redmine Comment
Author Name: Todd Strader (@toddstrader)
Original Date: 2019-10-03T09:38:45Z


#4 makes sense to me. Naively, it seems that s/name()/nameHashed()/ in V3EmitC would be pretty close to what we need.

I don't think this will affect V3ProtectLib since it is, by definition, only emitting things outside the protected envelope. Of course, if need be I can change the emitter as well.

And you've probably already thought of this, but for completeness I would suggest having a regression test for this feature run "nm -C" and "strings" on compiled artifacts and grep to make sure that we're not leaving anything visible in there.
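A minimal sketch of such a check, assuming the test knows the list of private names it expects to be hidden. The helper names `find_leaks` and `scan_artifact` are hypothetical and this is not how Verilator's regression suite is written; it only shows the "run nm -C and strings, then grep" idea.

```python
import subprocess


def find_leaks(tool_output: str, private_names) -> list:
    """Return the private names that appear anywhere in `tool_output`."""
    return sorted(n for n in private_names if n in tool_output)


def scan_artifact(path: str, private_names) -> list:
    """Run `nm -C` and `strings` on one compiled artifact and report leaks."""
    leaks = set()
    for cmd in (["nm", "-C", path], ["strings", path]):
        result = subprocess.run(cmd, capture_output=True, text=True,
                                errors="replace", check=True)
        leaks.update(find_leaks(result.stdout, private_names))
    return sorted(leaks)
```

A test would assert that `scan_artifact(lib_path, names)` returns an empty list. Plain substring matching is deliberately strict, so very short signal names may produce false positives and are best avoided in the test design.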

@veripoolbot
Contributor Author


Original Redmine Comment
Author Name: Wilson Snyder (@wsnyder)
Original Date: 2019-10-06T13:30:32Z


Calling it --protect-ids; nearly there.

@veripoolbot
Contributor Author


Original Redmine Comment
Author Name: Wilson Snyder (@wsnyder)
Original Date: 2019-10-06T17:25:52Z


Pushed to git towards 4.022.

Todd, you'll presumably want to make this new option also turn on when --protect-lib is used.

Note also the undocumented --debug-protect which renames all symbols by just adding a prefix - makes it easy to debug where a protect() call is missing.
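To show why a prefix-only mode helps, here is a small sketch: if every protected name is emitted as prefix plus original name, any private name that still shows up bare in the output marks a spot where the protect call was skipped. The prefix value and the helper names `protect_debug` and `missing_protect` are assumptions for this example, not Verilator's actual behavior.

```python
import re

DEBUG_PREFIX = "PSDBG_"  # hypothetical prefix, chosen only for this sketch


def protect_debug(name: str) -> str:
    """Debug-mode protection: keep the name readable, just tag it."""
    return DEBUG_PREFIX + name


def missing_protect(emitted: str, private_names) -> list:
    """Private names that appear in `emitted` without the debug prefix."""
    missing = []
    for name in private_names:
        # A prefixed name forms one longer identifier, so the word-boundary
        # match below only hits occurrences that were emitted bare.
        if re.search(r"\b" + re.escape(name) + r"\b", emitted):
            missing.append(name)
    return missing
```

For example, scanning `"int PSDBG_count; int state;"` for `count` and `state` reports only `state`, the name that was emitted without going through the protect step.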

@veripoolbot
Contributor Author


Original Redmine Comment
Author Name: Wilson Snyder (@wsnyder)
Original Date: 2019-11-10T19:28:36Z


In 4.022.
