Add Continuous Integration #1468
Comments
Original Redmine Comment Aaaand, I just noticed: That's pretty timely. I had originally thought it would be easier to piggyback on external projects' existing CI testing (e.g. add a head-to-head build in addition to using a fixed Verilator version). But doing it this way would certainly require less (read: no) buy-in from others. Once we settle on what to do with the main repo I can kick things around with CI on verilator_ext_tests. So, adding to the future work list:
Original Redmine Comment https://github.com/verilator/verilator can now be used instead of git.veripool.org/git/verilator; they are identical. Eventually github might become the documented one. BTW I set up ext_tests with verilator separate as that was better for my workflow; if this is painful for travis, verilator could be a submodule underneath it.

As to the patch: I'm still using 3.81 on one system, so I think it's unintentional if later than 3.81 is needed; what broke?

+script: ./configure && make -j 2 && make test && cd test_regress && ./driver.pl --j 0

I think this should use the maintainer flag and extended tests: configure with --enable-maintainer-mode --enable-long-tests. Any reason not to use just "make -j 3"? Usually one more than the number of CPUs works better, as builds are IO bound. As you noted, we can have more systems do testing. Perhaps add to the makefile test_regress_dist/_vlt/_vltmt targets that just pass the flags "--dist", "--vlt", and "--vltmt" to the driver respectively, as these are very crude thirds. Please add ccache.
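Put together, the suggestions in this comment could look roughly like the following .travis.yml fragment. This is only a sketch: the flag spellings follow the comment above, and the ccache line uses Travis's stock ccache support rather than anything in the branch under discussion.

```yaml
# Hypothetical .travis.yml fragment reflecting the suggestions above.
language: cpp
cache: ccache            # Travis's built-in ccache support
script:
  - ./configure --enable-maintainer-mode --enable-long-tests
  - make -j 3            # one more job than the 2 worker CPUs, since builds are IO bound
  - make test            # runs test_regress via the top-level Makefile
```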
Original Redmine Comment
It works locally on my machine and on newer Travis images, both of which have make 4.1. And when I upgraded the Trusty image's make, the failure went away there as well. That's as far as I debugged this one.
However, the issue here is that while you can make stages with dependencies in Travis, the disk doesn't persist between stages, at least not in the open source service. That means that if we split up the tests like this, we'll have to rebuild Verilator three times instead of once. That's fine-ish if we're only running one OS + compiler, but it would definitely be worse if we build additional OSes/compilers. Long-term, I'd suggest using the limited parallelism for that. But it's pretty simple, so I can do the crude test split for the short-term.
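For reference, the stage layout being discussed would look something like this (stage names and scripts are illustrative, not the branch's actual config):

```yaml
# Hypothetical Travis build stages: jobs in a later stage start on a
# fresh VM, so without a shared cache each test job rebuilds Verilator.
jobs:
  include:
    - stage: build
      script: ./configure && make -j 3
    - stage: test
      script: make test    # runs on a clean disk unless the build is cached
```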
Original Redmine Comment I would hope that the ccache is persistent across their farm, or at least it should be; otherwise I don't see why they would support it at all (versus users just enabling it outside their travis config file). I disabled the t_flag_csplit test when an old make is present.
Original Redmine Comment Hi, I just wanted to point out our ongoing activities on LibreCores CI. It is a Jenkins-based CI system we are continuously (but a bit slowly) working on, specifically for silicon projects. In case you see problems with Travis, we are happy to help! Cheers,
Original Redmine Comment Woah. I assumed that you couldn't share state between instances because: But I guess not: This could change the parallelism calculus. I wonder if we can force different compilers etc. to start in parallel and then have subsequent test slices run after the build is cached. Something to explore...

I'm still trying to get the cache to work in Travis. I see ccache build the cache and Travis uploads it somewhere, but subsequent runs don't seem to pull it down. Still investigating this.

I also think I misunderstood your comment about Makefile changes for the test slices. I'm guessing you'd like to avoid cd'ing to test_regress and just "make test" instead. That's in there now.

What is --enable-maintainer-mode supposed to do? When I run "./configure --help" I see:

And when I ./configure both with and without this option, the resulting Makefiles are the same. I wonder if I don't have something set up correctly.

--vltmt is a bit of an issue. It's currently hardcoded for three threads, but we only have two cores. I modified driver.pl to set this to the minimum of three and the machine's core count (i.e. two under Travis). First off, I don't know if one less thread is going to break any coverage that's currently in there. But beyond that, a number of the tests break now saying:

which is sort of weird because I decreased the thread count. I can look into this, but any suggestions you have would be appreciated. Travis log here: The current state of the world is here:
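The driver.pl change described above amounts to clamping the --vltmt thread count to the machine's core count. A standalone shell sketch of that logic (the core count is hard-coded to Travis's 2 here for illustration, rather than probed with nproc):

```shell
#!/bin/sh
# Clamp the --vltmt thread count: default to 3 threads, but never use
# more than the cores available (Travis workers have 2).
cores=2        # stand-in for $(nproc)
threads=3
if [ "$cores" -lt "$threads" ]; then
  threads=$cores
fi
echo "threads=$threads"
```

Under Travis this yields 2 threads, which is the "one less thread" coverage concern raised above.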
Original Redmine Comment I guess I just wasn't patient enough. I was looking for the cache to be shared between the test slices (I had one run after the first one completed). But the key they store the caches under must include the environment, because it is different between the three test slices. However, when I run the same test slice again, it has the same key and it picks up the cache. The time for build + --dist tests went from 8:40 to 3:00. For --vlt and --vltmt it looks like we're getting close to the maximum cache size they'll give us:

I'm not sure how much this will grow if we combine all the tests, but this may be reason enough to keep them split up.
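For context, the caching being tested here is Travis's stock ccache support; a sketch of how it might be wired up, with the size cap being my guess at one way to stay under the cache limit mentioned above (not the branch's actual config):

```yaml
# Hypothetical fragment: enable Travis's ccache cache and cap its size
# so the uploaded archive stays within the service's cache limit.
language: cpp
cache: ccache
before_script:
  - ccache --max-size=400M   # keep the per-job cache under Travis's cap
  - ccache --zero-stats      # reset counters so hit rates reflect this run
after_script:
  - ccache --show-stats      # inspect hit rates in the job log
```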
Original Redmine Comment OK, so I'm not sure what to do with --vltmt. Based on reading the man page and this issue: I'm guessing that, based on the number of tests that skip the vltmt scenario (95 by my count), some pruning already happened when multithreading was initially added. Based on that, my next step here is to skip these tests under --vltmt if the detected core count is <= 2. Please let me know if you think we should be doing something else. There are also two other tests that fail with different error messages:

but based on the error messages, it sounds like variations on the same theme.
Original Redmine Comment As advertised, I am now skipping these failing vltmt tests if the core count is < 3. I'm guessing that the large number of tests with

are tests that can't run under vltmt. I just replicated that behavior. However, if I'm interpreting this correctly, doesn't that mean that these tests won't run under other simulators? It seems like we'd rather want something like "vltmt => 0", but that doesn't appear to work like I imagined. I believe this is ready to pull now: On my travis-ci.com dashboard there is a button to request access for GitHub organizations I'm in. Once we land that, I'll click it and we'll see what happens.
Original Redmine Comment UNOPTTHREADS is a bit finicky; it's based on the parallelism it can find. Tests with vlt => 1 generally are that way because, while they could run

Generally "scenarios" has indicated where a test case should be run. if ($Self->too_few_cores) { (Generally anything called by a t/ file should use $Self instead of ::

If that doesn't work, or maybe after, should we make a verilator organization

This test was modified in trunk to skip with old make, so installing make

To answer the earlier question: --enable-maintainer-mode does nothing at

Nit: $Vltmt_Threads, as in perl scripts the convention is to capitalize global variable names.

I'd like to have this test especially working in travis, as it is the only check
Original Redmine Comment
Oh yeah, I don't want this to run under my account. I just think this button I see is the way to request that the organization (presumably authorized by you) allow Travis to connect to its GitHub account. I've never done this before, so we'll see how it goes. But the end state should be this happening in Travis as the Verilator GitHub organization. I believe I've incorporated all of the feedback and now the only tests that are failing are t_gantt and t_inst_tree_inl0_pub1. The latter also appears to be specifically testing for three threads. I'll see what I can do about adapting these two tests.
Original Redmine Comment Sorry, what I meant for t_gantt was to just pass --threads 2 always and then fix the test. I just pushed that change. I think your deltas are fine; you can push your branch to verilator/verilator's master yourself when you feel ready. As you're a verilator project maintainer, hopefully that's enough for travis to let you make a verilator project account; otherwise, if I need to do something instead, let me know.
Original Redmine Comment OK, I believe everything is push-able now: all the tests are green and I think I've got everything else cleaned up.

Wow, that's weighty. I tried this, but GitHub said no:

Also, I'd like to discuss how you see this working mechanically, e.g. should I update Changes before pushing? We can take that offline if you'd like.

I think we need something else. I clicked the button but it's now waiting on confirmation from the organization owner, which I believe is you. No rush, but I expect you have an email or something.
Original Redmine Comment Fixed a mis-setting; you should be able to push. If there are other problems, you might want to use git rather than http as a transport. Yes, when pushing, generally please update Changes. This stuff doesn't really impact users, though, so I wouldn't update it for these changes. Found the auth email and approved it.
Original Redmine Comment OK, the push works now and I see the authorization for Travis. I'm not sure how to tell Travis to start watching the organization's repos. I'll be offline through the weekend, but will figure this out next week.
Original Redmine Comment FYI Travis caught that I botched t_flag_csplit on old makes; fixed (I think).
Original Redmine Comment FYI, got a false failure. I added automatic retry of failures (long overdue). Also turned down the verbosity.
Original Redmine Comment Yeah, we only have three workers to play with regardless of OS, compiler, etc. I've set it up so that we can usually get the ~30 minute builds we've been seeing: Please let me know what you think of this strategy. The YAML is here: I also added a Travis badge to the README. It doesn't show up when you run perldoc, but GitHub renders it. Still TODO:
Original Redmine Comment A couple more notes: I'm pretty sure that t_leak is not flaky but just failing under Ubuntu 14.04. It probably looked flaky because, as we are running with "os: linux", Travis is free to pick whichever version it wants. It looks like the Vt_leak binary finishes with a bad return code and without printing anything. I'll have to look into testing this in a container or VM, because debugging through Travis does not sound fun. If this is true, do you want to turn off --rerun? I don't think I've ever seen a test flake out, and I think it would be good to know if that is happening. Also, you mentioned:

However, please note that the latest Ubuntu version Travis currently has is 16.04: I'm letting Travis pick the Ubuntu version on pushes (presumably this will trend toward the latest version they have) and pinning an explicit version in the cron'ed jobs. Also, that GitHub branch above is very much a moving target because I have to push so much to test with Travis. If you want to take a look at the point in time I was talking about there, please see:
Original Redmine Comment Feel free to push the badged README.pod. The more_travis branch currently seems to have hackage to debug t_leak, so don't push that. I started a VM with Ubuntu 16.04 and the leak test passed. If we have to skip it on travis we can. I like having the rerun, as tests now print almost nothing the first time, then are verbose on the rerun, so the travis logs look reasonable.
Original Redmine Comment FYI I disabled t_leak.pl for now when under travis.
Original Redmine Comment Ok, cool. I'm pretty sure it's only happening under 14.04 in Travis (and only with --vltmt). I tried to reproduce under docker and a VM but could not. I'm checking for that specific distro under Travis in my branch and skipping. So if you want, I can put that in so we get the test under 16.04. My cron/push YAML split seems to work, except I bumped into a 50 minute runtime limit. So we still need the three-way test split for the cron jobs. I'm trying to find a way to do this without a copious amount of YAML. So I believe gcc under 16.04 and 14.04 is working now. I still need to look at osx and clang. And yeah, I really like the quiet output, especially for Travis. I'm just worried that --rerun will allow currently solid tests to become flaky. But since Travis isn't the only place these tests are run, maybe it doesn't really matter and preventing false positives is more worthwhile.
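The cron/push split described here can be expressed with Travis job conditions; a sketch, with the pinned distro names being illustrative stand-ins rather than the branch's actual matrix:

```yaml
# Hypothetical split: pinned-distro jobs only on cron runs, a floating
# distro on ordinary pushes so new default images get picked up for free.
jobs:
  include:
    - if: type = cron
      dist: trusty       # Ubuntu 14.04
    - if: type = cron
      dist: xenial       # Ubuntu 16.04
    - if: type != cron
      os: linux          # let Travis pick its current default image
```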
Original Redmine Comment I don't know if it's the only clang problem, but there is a systemic problem when using clang under Travis which is caused by -faligned-new. I read the discussion in #� and see that you added a canary to configure to test for this issue and remove the compiler flag. However, in this setup we're passing the test:

but still failing when we try to compile Verilated code:

But when I just pull the flag out of verilated.mk, things work: I'm guessing there is some environment problem here, but I also don't understand why the sanity check in configure doesn't fail. Any thoughts you have on this would be appreciated.
Original Redmine Comment I was thinking that configure compiles but doesn't link, but it does seem to try a link, so I'm at a loss.
Original Redmine Comment This is ready to push/discuss next steps: Here is the corresponding build: And the cron build: Some notes: I believe t_leak.pl only fails under Ubuntu 14.04 with --vltmt. I changed the test to skip only that scenario so we can get more coverage. t_sys_sformat_noopt.pl was skipping clang versions between 3.8 and 5.0, but I'm seeing it fail both in Travis and locally with clang versions 7 and 8. I've disabled this test for clang in general. I've run into two issues so far with osx (sudo cpan isn't working for me and config_rev.pl is failing). I haven't seriously used an Apple since my IIc+. I'm going to leave this for someone who has access to one of Mr. Jobs's computers. The YAML is less terse than I would like. I checked with the Travis support folks and this seems to be the best way to:

Currently, cron tests the configurations you suggested (minus osx) and push-initiated jobs will kick off a random linux distribution. The latter gives us the advantage of picking up later Ubuntu distributions for free when Travis upgrades. Once I push this change, I'll turn on a daily cron job for the main Verilator project like I have on my fork. As is, I would plan to open a couple of follow-on issues:
Original Redmine Comment Great, again. "more-travis" looks fine; please squash-merge it to trunk (or, if you get it to one commit, just a normal pull to trunk). I have an idea who I can sign up for OSX, as I also don't use it (which was part of the motivation to get it regressing); let me ask him.
Original Redmine Comment Squashed, pushed and the tests are green. Also for reference, #� and #� were opened as discussed above.
Original Redmine Comment Awesome work!
Original Redmine Comment P.S. Note that github just announced free open source CI via Actions. From what I can tell it's two active actions, each with two cores, so Travis still has better throughput for now.
Author Name: Todd Strader (@toddstrader)
Original Redmine Issue: 1468 from https://www.veripool.org
Original Assignee: Todd Strader (@toddstrader)
This builds Verilator and runs the entire test suite within Travis-CI:
https://github.com/toddstrader/verilator-dev/tree/travis
I've just been working with the free-for-open-source service via GitHub. Here's the build that corresponds to HEAD of this branch:
https://travis-ci.com/toddstrader/verilator-dev/builds/115871166
I surveyed the free CI options available on GitHub. Travis seems widely used, is easy to set up, and as far as I can tell has no build/CPU-hour/etc. limits for open source projects. It's pretty simple to get going for your personal fork on GitHub. Based on experimentation, you seem to be able to launch three workers in parallel for this tier of service. I'm not currently doing anything with that (see below). The workers seem to have two CPUs, which I do leverage in the .yml.
The .yml is a bit hairy with the make version upgrade and cpan installation. I think it might be better to add a ci/ directory to hold helper scripts, but I don't think that's necessary at this point. I'm open to suggestions on this.
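For a sense of what the hairy parts look like, a before_install section of this kind might contain commands along these lines. This is a sketch only: the package names are illustrative of the sort of setup described (a make upgrade and cpan installs), not the branch's exact contents; Parallel::Forker is the perl module the test driver is known to use.

```yaml
# Hypothetical before_install fragment: distro packages plus the perl
# module installation mentioned above.
before_install:
  - sudo apt-get update -qq
  - sudo apt-get install -y ccache libfl-dev   # illustrative build deps
  - sudo cpan -T Parallel::Forker              # perl module used by driver.pl
```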
I'm also open to explore (a limited number of) other services if people have suggestions. But this one seems to work fine.
Future work would include:
I can open issues for these items if we proceed.
Also (and I don't mean to steal any thunder here), I noticed this today:
https://github.com/verilator/verilator
Does this mean that Verilator is going to be mirrored or hosted here? I was planning to actively sync from veripool.org to my repo to get CI going. But if this is going to be constantly up-to-date one way or the other, we can easily set it up there as well. The biggest advantage with that is that incoming PRs will kick off a CI build. So whenever people get around to looking at PRs, hopefully the tests are already done.
On a separate note, I got to thinking about this because of the recent Ariane breakage. I think it would be helpful to build well maintained public Verilog projects head-to-head with Verilator. That's probably outside the scope of Verilator itself, but would give us more thorough coverage if it can be set up. I'm not quite sure how that should work, but I'll post something on the forums when I figure something out so people can tune in if they are interested.