Skip to content

Output g2html directly #1752

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from
Draft

Output g2html directly #1752

wants to merge 9 commits into from

Conversation

sim642
Copy link
Member

@sim642 sim642 commented May 23, 2025

Currently for the g2html output Goblint produces:

  1. One massive XML file with all results.
  2. .dot files for all function CFGs.

Then g2html takes those and does the following:

  1. Splits the massive XML file into individual XML files for nodes, files, globals, warnings.
  2. Runs Graphviz on all the .dot files to get .svg files.
  3. Copies some static resources to the output directory: XML transformers, CSS styles, some JS and fonts.

This PR shows that we can easily cut out the middle man and get rid of the Java-based g2html altogether by doing the following directly in Goblint:

  1. Producing the individual XML files.
  2. Producing the .dot files and also running Graphviz on them.
  3. Copying the same static stuff.

The result is almost exactly the same HTML-like results viewing experience.

The only difference is that g2html includes a custom lexer for unpreprocessed C that's used for highlighting code in the g2html file view. This remake doesn't do highlighting, but I don't know if anyone even looks at the file view in g2html.
I think it's a minor loss for what otherwise allows us to get rid of an ancient Java-based component which just wastes time copying XML around and reconstructing liveness information.
Doing the same directly in OCaml means that this output is also available to Goblint installed via opam itself, where otherwise g2html isn't present.

TODO

  • Add syntax highlighting. Maybe using Pygments?
  • Remove old single-XML output?
  • Clean up new implementation code.

@sim642 sim642 added cleanup Refactoring, clean-up performance Analysis time, memory usage setup Dependencies, CI, releasing proof-of-concept labels May 23, 2025
@michael-schwarz
Copy link
Member

The only difference is that g2html [highlights] code in the g2html file view. This remake doesn't do highlighting

I think the highlighting is important, g2html and the highlighted reports are by far the most accessible way to look at Goblint results, and our CFGs are not really intuitive.

If we want people to be able to use Goblint who are not researchers, we should not tie ourselves to obscure things such as custom (semi-working) IDE plugins or niche formats such as YAML witnesses.

but I don't know if anyone even looks at the file view in g2html.

Is is the view I use most and also always recommend to the students. The CFG view feels too messy for me.


I don't really have stakes in the rest of it, g2html has worked reliably for many years and people tend to have Java installed, so I personally think there's more pressing issues then this. This said, I'm also not opposed as long as we don't make it worse by removing the syntax highlighting without gaining anything.

@DrMichaelPetter
Copy link
Collaborator

The only difference is that g2html [highlights] code in the g2html file view. This remake doesn't do highlighting

I think the highlighting is important, g2html and the highlighted reports are by far the most accessible way to look at Goblint results, and our CFGs are not really intuitive.

If we want people to be able to use Goblint who are not researchers, we should not tie ourselves to obscure things such as custom (semi-working) IDE plugins or niche formats such as YAML witnesses.

but I don't know if anyone even looks at the file view in g2html.

Is is the view I use most and also always recommend to the students. The CFG view feels too messy for me.


I don't really have stakes in the rest of it, g2html has worked reliably for many years and people tend to have Java installed, so I personally think there's more pressing issues then this. This said, I'm also not opposed as long as we don't make it worse by removing the syntax highlighting without gaining anything.

If highlighting is so important, can we maybe use external calls for this, similar to dot? There are plenty Linux tools that support highlighting in various output formats....

@sim642
Copy link
Member Author

sim642 commented May 24, 2025

Is is the view I use most and also always recommend to the students. The CFG view feels too messy for me.

Oh OK, it might be my personal preference for the CFG view then. With all the CIL transformations taking place, per-line abstract states can be confusing because they aren't just different paths on the line, but different (internal) points on the line.

If highlighting is so important, can we maybe use external calls for this, similar to dot? There are plenty Linux tools that support highlighting in various output formats....

Indeed, we could just rely on Pygments or something like that.
Or at worst, the grammar in g2html could probably be converted to ocamllex without too much hassle, although I'm not sure if maintaining our own lexer for unpreprocessed C is sensible. It hasn't been updated in so many years whereas we've added C11, etc support into CIL.

I don't really have stakes in the rest of it, g2html has worked reliably for many years and people tend to have Java installed, so I personally think there's more pressing issues then this. This said, I'm also not opposed as long as we don't make it worse by removing the syntax highlighting without gaining anything.

g2html currently tends to be awfully slow. For example on a trivial-size program:

$ ./regtest.sh 04 01 -v
[...]
                                  cputime   walltime   allocated   count
Default                            1.285s     0.675s     57.24MB      1×
  preprocess                         0.052s     0.049s      0.01MB      1×
  FrontC                             0.009s     0.009s      9.01MB      3×
  Cabs2cil                           0.012s     0.012s      8.35MB      3×
  mergeCIL                           0.005s     0.005s      3.52MB      1×
  analysis                           0.026s     0.033s     33.67MB      1×
  [...]
  g2html                             1.180s     0.564s      0.01MB      1×

The majority of time is spent running g2html which shuffles XML around. Using g2html on any remotely-realistic program (even for debugging Goblint itself) tends to be a pain, because it's often so much slower than the analysis itself.

@DrMichaelPetter
Copy link
Collaborator

Indeed, we could just rely on Pygments or something like that.
Or at worst, the grammar in g2html could probably be converted to ocamllex without too much hassle, although I'm not sure if maintaining our own lexer for unpreprocessed C is sensible. It hasn't been updated in so many years whereas we've added C11, etc support into CIL.

Without myself having a deeper insight into CIL yet - could one maybe use CIL components for lexing and coloring, since there is already an effort of entertaining lexing in there? Maybe we could even benefit from the parsing, since typedef handling with lexing only sounds awful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cleanup Refactoring, clean-up performance Analysis time, memory usage proof-of-concept setup Dependencies, CI, releasing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants