Ehren's Blog

Towards a Giant Patch

Posted in Seneca by ehren on November 30, 2009

Well, I just spent a considerable amount of time trying to parse C++ with a regex, which wasn’t particularly enjoyable. Since I last posted, I’ve been able to pare down outparams.js to just what I need for my project. It’s now basically just a ZeroNonzero analysis together with a post-analysis step to round up all the alwayszero functions. This has increased the number of functions found to around 6300. After getting a little help from ctyler to write a script that displays 100 lines of context around these functions, I’m also pretty certain that they all really do return zero.

The next step was finding the location of the function declarations within their respective classes. Of course, this information is not stored with the tree node representing a function declaration. Not that it matters, but apparently GCC considers the function name right before the definition to be the declaration (and I guess when present anywhere else it’s a forward declaration). Anyway, the next best thing is getting the path and line number of the class definition, which I’ve done, so I can then search for the function name from that point on.

One problem with this regex approach is that if I wanted to be exact about the matches, I’d need to consider type information about the function’s parameters. It’s actually quite easy to get this info with Treehydra, but there are a number of complications. Default parameter values, eg int foo(int x = 0);, are one monkey wrench. Another thing I’m stuck on is placing shell variables into the lhs of a sed regex while still preserving the ability to use characters with special meanings like .* etc (being able to place special characters into the substituted variable would be even better). To get around this problem, I’ve checked in Treehydra whether the function is overloaded in the class and, if so, printed a message about it. I can then exclude such functions, which only amount to about 100, from my results, simplifying things considerably.
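To make the "search from the class definition onwards" idea concrete, here is a minimal JavaScript sketch. It is not the sed-based approach described above, just an illustration of the same problem: the function name is escaped so it matches literally, while the pattern around it keeps its special meaning. The helper names escapeRegExp and findDeclaration and the sample class are hypothetical.

```javascript
// Hypothetical helper: escape regex metacharacters so a name matches literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: return the 1-based number of the first line at or after
// classLine that contains funcName followed by "(", or -1 if no line does.
function findDeclaration(lines, classLine, funcName) {
  // Only the name is escaped; the rest of the pattern stays a real regex.
  const pattern = new RegExp(
    "(^|[^\\w~])" + escapeRegExp(funcName) + "\\s*\\("
  );
  for (let i = classLine - 1; i < lines.length; i++) {
    if (pattern.test(lines[i])) return i + 1;
  }
  return -1;
}

// Made-up class body for illustration only.
const source = [
  "class nsFoo : public nsIFoo {",
  "public:",
  "  NS_IMETHOD GetBar(PRInt32 *aBar);",
  "  NS_IMETHOD Init();",
  "};",
];

console.log(findDeclaration(source, 1, "Init")); // prints 4
```

Overloads would still produce ambiguous matches with this kind of search, which is why excluding them up front keeps things simple.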

Another problem is that many, perhaps the majority, of these functions are hidden behind macros. This might not be such a bad thing though, since a relatively small number of manual edits could perhaps affect thousands of declarations. There are some relative paths in the analysis results as well, but the instances are few enough that manual edits are feasible.

Anyway, I do have some results of declarations which are ready to be patched here. There are only 1782 of them, which means more than two thirds must be handled in other ways. What I might do is finalize the analysis to not only emit errors when my attribute has been applied to a function that may not return zero, but also warn about those remaining functions that should have the attribute but don’t. I can then decide how to proceed once I’ve got a patched and plugin-enabled build up and running.

There’s also the Pork route, which by way of Elsa, has the apparent advantage of providing both the location of the function definition and also the location of the declaration within the class. It should be able to handle the macros too but my understanding’s pretty murky here. Beyond getting the thing built, I haven’t looked into it that much, though.

Help from GNU, Analysis update, and possible contrib

Posted in Seneca by ehren on November 26, 2009

A bunch of stuff has happened regarding a Firefox build with GCC 4.5. I ended up filing a bug report about the link error I mentioned in a previous post. After sending a message to the GCC mailing list referring to it, I received a swift and helpful response. This led to an easy fix on the Mozilla side.

In other news, I reran the outparams.js based analysis I posted before. This time I only checked virtual member functions, ie if (!DECL_VIRTUAL_P(func_decl)) return;. The result, after removing as many duplicates as possible (without performing any substitutions), is a surprising 2664 alwayszero functions (2512 are in .cpp files and should thus be unique). At this point I’m a bit suspicious of these numbers, especially since my earlier Dehydra analysis only turned up 243. I’ll have to go over them carefully tomorrow.

Here’s the list, btw.

Actually, if anyone’s looking for a contribution opportunity in OSD600/DPS909, finding any functions, even one, in the list that are not virtual or that don’t always return 0 would be helpful.

Reworking my analysis

Posted in Seneca by ehren on November 24, 2009

This weekend I’ve been stuck trying to come up with a Treehydra analysis similar to outparams.js that will root out those functions which always return zero (the ‘syntactic’ Dehydra analysis I posted previously was a bit naive). Trying to get this done for my 0.2, though, was a grave mistake. It’s somewhat frustrating because I know this script already does what I want, namely it will ensure that an ‘out parameter’ is written upon a zero return. However, my aim is slightly different in that I don’t care about any other properties of the state when I’m in a state that contains a zero return; I just care that every state with a return is a zero-return state.

There’s another file that’s used by outparams.js called mayreturn.js that “determines the set of variables that may transitively reach the return statement” (eg w = x; x = y; y = z; return z;). It works by checking for either a RETURN_EXPR or a GIMPLE_MODIFY_STMT, in an instruction under consideration. If a RETURN_EXPR is found, the variable being returned is added to the current state. If a GIMPLE_MODIFY_STMT is found (ie an assignment) and the lhs is already in the state then the rhs is added to the state. Because this is implemented in the flow function, a change in the state will initiate another iteration through all the blocks of the function with the flow function called on each instruction. Or rather I suppose the iteration is done over only the predecessor blocks (given that mayreturn implements a ‘backward analysis’).
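The rule described above can be sketched on a toy statement list. This is only my reading of the description, not the real mayreturn.js code: the statement objects, the mayReturnVars name and the example variables are all made up for illustration, and there is no control flow graph, just one straight-line block walked backwards until nothing changes.

```javascript
// Toy statements (hypothetical format):
//   {kind: "assign", lhs, rhs}  or  {kind: "return", value}
function mayReturnVars(stmts) {
  const state = new Set();
  let changed = true;
  // Keep iterating until the set stops growing (a fixed point).
  while (changed) {
    changed = false;
    // Walk the statements backwards, as a backward analysis would.
    for (let i = stmts.length - 1; i >= 0; i--) {
      const s = stmts[i];
      if (s.kind === "return" && !state.has(s.value)) {
        // The returned variable joins the state.
        state.add(s.value);
        changed = true;
      } else if (s.kind === "assign" && state.has(s.lhs) && !state.has(s.rhs)) {
        // lhs may reach the return, so rhs may reach it too.
        state.add(s.rhs);
        changed = true;
      }
    }
  }
  return state;
}

// tmp = rv; result = tmp; return result;
const block = [
  { kind: "assign", lhs: "tmp", rhs: "rv" },
  { kind: "assign", lhs: "result", rhs: "tmp" },
  { kind: "return", value: "result" },
];

console.log([...mayReturnVars(block)].sort()); // prints [ 'result', 'rv', 'tmp' ]
```

For a single straight-line block one backward pass is already enough; the outer loop is there only to mirror the "a change in the state triggers another iteration" behaviour.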

The main thing I’m confused about is that mayreturn.js seems to identify only a single variable as the ultimate return variable. This is even though a BackwardAnalysis (which mayreturn implements) iterates over every block (at least initially). As to its use in outparams.js, the mayreturn analysis is applied here, in process_tree which I’m certain is only invoked once.
So how are multiple returns handled?

This mozilla wiki page on abstract interpretation has been incredibly helpful in getting an idea of how all this stuff works. Going off it, here would be my idea for an always zero return checker:

I think the abstract values are already defined for me ie Zero_NonZero.Lattice.ZERO and Zero_NonZero.Lattice.NONZERO. Maybe I’m confused here and I actually need a new state ie RETURNS_ZERO and RETURNS_NON_ZERO.

The flow function should be something like ‘If stmt is an assignment by the zero constant, and if the lhs of the assignment is in the transitive return variable set identified by mayreturn.js, then set the state to ZERO (or RETURNS_ZERO)’.
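As a rough sketch of that idea, here is a toy checker in plain JavaScript. It sidesteps the multiple-returns question by treating a function as an explicit list of paths, which is not how Treehydra or ESP work. The ZERO/NONZERO/UNKNOWN constants only imitate the names in Zero_NonZero.Lattice, and flow, evalOperand and alwaysReturnsZero are hypothetical names.

```javascript
// Toy abstract values; the real Zero_NonZero lattice has more structure.
const ZERO = "ZERO", NONZERO = "NONZERO", UNKNOWN = "UNKNOWN";

// Abstract value of an operand: a numeric constant or a variable name.
function evalOperand(operand, state) {
  if (typeof operand === "number") return operand === 0 ? ZERO : NONZERO;
  return state.has(operand) ? state.get(operand) : UNKNOWN;
}

// Flow function: update the state for one statement. Returns the abstract
// value of a return statement, or null for any other statement.
function flow(stmt, state) {
  if (stmt.kind === "assign") {
    state.set(stmt.lhs, evalOperand(stmt.rhs, state));
    return null;
  }
  if (stmt.kind === "return") return evalOperand(stmt.value, state);
  return null;
}

// A function is modelled as a list of paths, each a list of statements.
// It is "always zero" only if every path ends in a return known to be ZERO.
function alwaysReturnsZero(paths) {
  return paths.every((path) => {
    const state = new Map();
    for (const stmt of path) {
      const rv = flow(stmt, state);
      if (rv !== null) return rv === ZERO;
    }
    return false; // no return on this path: not provably zero
  });
}

const zeroPaths = [
  [{ kind: "assign", lhs: "rv", rhs: 0 }, { kind: "return", value: "rv" }],
  [{ kind: "return", value: 0 }],
];
const mixedPaths = [
  [{ kind: "assign", lhs: "rv", rhs: 0 }, { kind: "return", value: "rv" }],
  [{ kind: "assign", lhs: "rv", rhs: 1 }, { kind: "return", value: "rv" }],
];

console.log(alwaysReturnsZero(zeroPaths), alwaysReturnsZero(mixedPaths)); // prints true false
```

The point of the sketch is only the final condition: nothing else about the state matters, as long as every return is a zero return.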

Because I’m still unclear how multiple returns are handled though, I’m not really sure how to get started. There is a place in outparams.js where a given substate is checked for whether it returns zero (and if it does and an outparam is not written to, a warning is issued). Likewise, there is also a check for whether an outparam is written to upon returning a failure code. Unfortunately, the notion of ‘substate’ is at the heart of ESP analysis which is not explained in that Mozilla wiki page I posted above… although there’s an ESP heading with no body :). My vague understanding is that now you’re dealing with sets of states. My only recourse in getting more info is this paper where ESP (Error Detection via Scalable Program Analysis) is introduced. (Naturally it’s 100% crazy stuff).

Edit: hmm… just as I wrote that last paragraph something dawned on me. I think everything I just wrote is nonsense or at least irrelevant (I’ll leave it though because I have to get my blogging up).

Anyway, consider the checkSubstate function in outparams.js. I tried modifying it as well as the post analysis error checker where it’s called:

diff -r 41c1b69b3ed3 xpcom/analysis/outparams.js
--- a/xpcom/analysis/outparams.js	Mon Nov 23 22:17:06 2009 -0500
+++ b/xpcom/analysis/outparams.js	Tue Nov 24 00:20:11 2009 -0500
@@ -522,38 +522,48 @@ function unwrap_outparam(arg, state) {
   }
   if (outparam) return outparam;
   return arg;
 }
 
 // Check for errors. Must .run() analysis before calling this.
 OutparamCheck.prototype.check = function(isvoid, fndecl) {
   let state = this.cfg.x_exit_block_ptr.stateOut;
+  var alwayszero = true;
   for (let substate in state.substates.getValues()) {
-    this.checkSubstate(isvoid, fndecl, substate);
+    if(!this.checkSubstate(isvoid, fndecl, substate)) {
+      alwayszero = false;
+      break;
+    }
+  }
+  if (alwayszero) {
+    print("alwayszero function found! location: " + location_of(this.fndecl) + " name: " +  function_decl_name(this.fndecl));
+  }
 }
 
 OutparamCheck.prototype.checkSubstate = function(isvoid, fndecl, ss) {
   if (isvoid) {
+    return false;
     this.checkSubstateSuccess(ss);
   } else {
     let [succ, fail] = ret_coding(fndecl);
     let rv = ss.get(this.retvar);
     // We want to check if the abstract value of the rv is entirely
     // contained in the success or failure condition.
     if (av.meet(rv, succ) == rv) {
+      return true;
       this.checkSubstateSuccess(ss);
     } else if (av.meet(rv, fail) == rv) {
+      return false;
       this.checkSubstateFailure(ss);
     } else {
       // This condition indicates a bug in outparams.js. We'll just
       // warn so we don't break static analysis builds.
       warning("Outparams checker cannot determine rv success/failure",
               location_of(fndecl));
+      return false;
       this.checkSubstateSuccess(ss);
       this.checkSubstateFailure(ss);
     }
   }
 }
 
 /* @return     The return statement in the function
  *             that writes the return value in the given substate.

This is ugly hackery but it seems to do the trick.

Anyway, click here to see all the 3074 functions found. Checking for duplicates, ie cat outparam-like-analysis.txt | sort | uniq | wc -l, I get 2874, which would seem to be accurate given that I’m dealing with absolute filename paths (ie the many instances when an alwayszero function is present in a header file should be taken care of). Interestingly, this is way more than the 396 functions found by another (unposted) analysis that I previously tried, which checked for alwayszero methods (either virtual or non-virtual). See this post for a similar analysis.

There may be a few false positives too, but hopefully not a great many (I haven’t thoroughly checked the results so they could be totally wrong). At this point I’m not too concerned about interpreting the data because there are lots of non-virtuals in the mix. I think this can be changed with a function similar to is_constructor, though. I will have to strip out all of the non-relevant code which is not completely trivial, but it shouldn’t be too hard. I suppose I don’t really need to do this until the final version, which should be a safety checker for my plugin ie if a function has the alwayszero attribute then it must always return zero. I’ll have to solve my plugin/build issues first, though.

Firefox build issues with GCC 4.5

Posted in Seneca by ehren on November 23, 2009

I’ve hit a large snag with my project. Unfortunately, getting a plugin to work with the backported gcc 4.3.4 is not as straightforward as I had assumed. I badly misread the ifdefs in dehydra_plugin.c here, thinking I’d be able to define my own pass struct and hook it in like with 4.5. Oops.

As far as I can tell, if I want to run my optimization with the patched 4.3 I’ll have to add my gimple manipulations to one of the callbacks used by Treehydra. This might be doable but I haven’t tried it. The alternative is to create another patch against 4.3.4, either adding a new pass, as with the plugin, or by trying a bit of conditional constant propagation hackery as with my 0.1. The solution may not be ideal in either case.

As an attempt to sidestep the issue, I’ve been trying to get Firefox to compile with 4.5. I was a bit confused about what’s needed in the bug report until I asked on #gcc: namely, whether I need to include only the preprocessed .ii version of the ‘problem file’ or preprocessed versions of every file in the directory, as I had seen with another report (especially since I can’t get the .ii file to compile independently). Someone informed me that the one .ii is enough, so here’s the new bug. So far no one’s bitten, so I think I’ll have to reproduce the problem with the irrelevant functions from jsxml.cpp taken out.

Anyway, this bug can be sidestepped by turning down -O3 to -O0, but I’ve hit another issue during the linking of another file. Here’s the output. I’ve reproduced the issue with a development version of binutils as well but I’m thinking this is actually a Mozilla bug. I’ll have to consult with someone who would know.

On the analysis front, I’ve got something in the works about my recent travails with Treehydra, but now that I’ve started to write it I think it needs its own post.

Analysis and type madness

Posted in Seneca by ehren on November 18, 2009

I’ve created a simple Dehydra script to check for alwayszero functions. There are some issues however. Here’s the script:

function process_function(f, body)
{
  if (f.type.type.name == 'void' || !f.isVirtual) {
    return;
  }

  var alwayszero = true;

  function processStatements(stmts) {
    for(var j = 0; j < stmts.statements.length; j++) {
      var s = stmts.statements[j];
      if (s.isReturn && s.value != 0) {
        alwayszero = false; 
      }
    }
  }

  for (var i = 0; i < body.length; i++) {
    processStatements(body[i]);
  }

  if (alwayszero) {
    print("alwayszero function: " + f.loc.file + " on line " + f.loc.line +
          ", column " + f.loc.column + " : " + f.type.type.name + " " + f.name);
  }
}

I actually have a more elegant version using a for in loop and iterate_vars but after watching that Douglas Crockford lecture I’m a bit paranoid about such things (both report the same results though).

After doing a bit of post-processing on the output I can report the following:

Functions meeting the above criteria are encountered 5920 times during compilation but most of these are duplicates since there are quite a few method definitions within include files.

Removing all of the duplicates reduces the number to 243 instances encountered (shouldn’t be too difficult to patch manually or with some shell script hackery).

A link to the list of 243 unique instances is here. The full output list can be obtained here (warning: big file).
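For what it’s worth, the duplicate removal is nothing fancy. A minimal JavaScript sketch of the idea follows; the report lines and the uniqueLines name are made up for illustration, and only the line format follows the print call in the script above.

```javascript
// Hypothetical report lines in the format printed by the script above.
// The same header method shows up once per translation unit that includes it.
const report = [
  "alwayszero function: a.h on line 10, column 3 : int A::f",
  "alwayszero function: b.cpp on line 42, column 1 : int B::g",
  "alwayszero function: a.h on line 10, column 3 : int A::f",
];

// Collapse identical lines and sort the survivors.
function uniqueLines(lines) {
  return [...new Set(lines)].sort();
}

console.log(report.length, uniqueLines(report).length); // prints 3 2
```

This only catches exact duplicates, so the same header reached through two different path spellings would still be counted twice.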

Unfortunately I have encountered at least one false positive with this script:

gfxFont.cpp on line 960, column 35 : gfxFont::RunMetrics gfxFont::Measure(gfxTextRun*, PRUint32, PRUint32, gfxFont::BoundingBoxType, gfxContext*, gfxFont::Spacing*)

Viewing the definition here, I’m not sure why this function was captured by my analysis but I’ll have to investigate. Ultimately, I think doing this with a Treehydra script might make more sense. There’s already an existing script that does all the heavy lifting, but I haven’t been able to link this into an independent analysis pass on mozilla-central (yet).

Edit: Argh… found another false positive here:
nsSVGContainerFrame.cpp on line 264, column 82 : gfxRect nsSVGDisplayContainerFrame::GetBBoxContribution(const gfxMatrix&)

More Issues:

When I get the plugin ready to go with mozilla, I’ll add a script to xpcom/analysis to ensure that every function which has the user(("alwayszero")) attribute meets the above criteria.

As to excluding non-virtual member functions from the analysis, as far as I can tell GCC should already be able to optimize these away. Perhaps there’s a corner case I haven’t considered though.

Another issue I thought of today is that this script can potentially flag floating point functions as alwayszero which will result in bad things happening if I apply my plugin as originally written to them.

Luckily a fix to the plugin was not that difficult:

static bool
sane_tree_type (tree t)
{
  enum tree_code code = TREE_CODE (TREE_TYPE (t));
  return code == INTEGER_TYPE || code == BOOLEAN_TYPE;
}

 /* other stuff --- see previous posts for context */

static unsigned int
execute_alwayszero_opt (void)
{
  gimple_stmt_iterator gsi;
  basic_block bb;

  FOR_EACH_BB (bb)
    {
      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
        {
          gimple stmt = gsi_stmt (gsi);

          tree lhs = gimple_get_lhs (stmt);

          if (is_gimple_call (stmt) && lhs != NULL_TREE &&
              is_alwayszero_function (stmt) && sane_tree_type (lhs))
            {
              tree zero = build_int_cst (TREE_TYPE (lhs), 0);
              gimple assign = gimple_build_assign_stat (lhs, zero);
              tree var = create_tmp_var (TREE_TYPE (lhs), "dummy_var");
              add_referenced_var (var);
              mark_sym_for_renaming (var);
              gimple_call_set_lhs (stmt, var);
              gsi_insert_after (&gsi, assign, GSI_CONTINUE_LINKING);
            }
        }
    }

  return 0;
} 

This will exclude pointer types as well which is probably a good thing. If it’s really necessary to optimize away alwayszero REAL_TYPE functions this could be done as well, but they will have to be handled separately. Actually this should be pretty easy but it’s not really a priority at the moment. It probably makes more sense to check the type of the function rather than that of the lhs, but consider the above a proof of concept.

The next step is getting my plugin to work with the GCC 4.3.4 with backported plugin support used to build Dehydra which I now know will not be too difficult. The main issue is that the gimple_stmt_iterator is nowhere to be seen (even though it shows up in the Changelog … wtf) so I’ll have to use the block_stmt_iterator instead which will require a slight bit more ugliness.

The main problem though is getting a plugin to build under the GCC 4.3.4 with backported plugin support used to build Dehydra. 4.5 includes a separate plugin include directory and an easy to use -print-file-name=plugin flag to get at the directory. I’ve tried hacking a build rule into the existing configure script in Dehydra but I’m a complete noob with make and configure stuff so I haven’t had any success yet. Hopefully I can rectify this shortly.

Also, I’ve been attempting to get FF built with 4.5 but have encountered some issues. Once I get a nice sequence of steps that follows the instructions on this page I’ll bring this to the attention of the GCC people.

i686 vs x86-64 issues

Posted in Seneca by ehren on November 17, 2009

Falling behind with my blog is not a good thing because I have a lot to report. Some/most of it is old news by now but I need to get it up here. Incidentally, I’ve completed a rudimentary Dehydra analysis on the mozilla-central repository so that my plugin can eventually be applied, but I will present the results in my next post (here’s the post, btw).

Anyway, last week, upon embarking on the static analysis component of this project, I ran into trouble using my GCC plugin on two of the CDOT’s x86_64 Fedora boxes. The plugin compiled, but using it on any test code invoked an internal compiler error (a segmentation fault). This was quite troubling since the plugin worked fine when compiled for i386 (tested with my personal Debian machine and another Fedora i686 system at Seneca).

Ultimately I solved the problem by finding a slightly different way of coding the optimization, but I’ll document a few of the steps I’ve attempted.

Although I have now updated it with the fix, the plugin code of my previous post contained the following (see that post for the context):

  tree lhs = gimple_get_lhs (stmt);
  tree zero = build_int_cst (TREE_TYPE (lhs), 0);
  gimple assign = gimple_build_assign_stat (lhs, zero);
  tree var = make_rename_temp (TREE_TYPE (lhs), "dummy_var");
  gimple_call_set_lhs (stmt, var);
  gsi_insert_after (&gsi, assign, GSI_CONTINUE_LINKING);

One clue about the issue is that the call on line 4 to make_rename_temp warns “initialization makes pointer from integer without a cast”. At the time my thoughts on the issue were “who cares, it works”. Even getting the ICE (which complained about the call to make_rename_temp), I figured that changing the code wasn’t the thing to do. Instead a convoluted journey of debugging the plugin with gdb, examining the registers, etc, etc, was what I had in mind. Conclusion: the call to make_rename_temp results in a crash.

In any event, I found a solution although it’s been long enough that I can’t quite remember how I arrived at it:

  tree lhs = gimple_get_lhs (stmt);
  tree zero = build_int_cst (TREE_TYPE (lhs), 0);
  gimple assign = gimple_build_assign_stat (lhs, zero);
  tree var = create_tmp_var (TREE_TYPE (lhs), "dummy_var");
  add_referenced_var (var);
  mark_sym_for_renaming (var);
  gimple_call_set_lhs (stmt, var);
  gsi_insert_after (&gsi, assign, GSI_CONTINUE_LINKING);

All of that add_referenced_var/mark_sym_for_renaming business is because the call to create_tmp_var creates a temporary that’s not in SSA form ie you’ll have an initialization like int dummy_var.0; and then when it’s used you’ll get dummy_var.0 = 0; when what you want is dummy_var.0_1 = 0;.

Looking at the code for make_rename_temp in tree-dfa.c, note that it’s basically identical to the code I used to replace its call:

tree
make_rename_temp (tree type, const char *prefix)
{
  tree t = create_tmp_var (type, prefix);

  if (TREE_CODE (TREE_TYPE (t)) == COMPLEX_TYPE
      || TREE_CODE (TREE_TYPE (t)) == VECTOR_TYPE)
    DECL_GIMPLE_REG_P (t) = 1;

  if (gimple_referenced_vars (cfun))
    {
      add_referenced_var (t);
      mark_sym_for_renaming (t);
    }

  return t;
}

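So where is the error with my original approach, given that it works on i686? I’m thinking that the representation of a tree as a union of tree nodes may be at issue, ie I was sticking the return of make_rename_temp where it should not be stuck. The warning hints at a simpler cause, though (unconfirmed): if no prototype for make_rename_temp is visible, C assumes it returns int, which would truncate a 64-bit pointer on x86-64 while leaving a 32-bit one intact on i686.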

Although it wasn’t of much use to me, I’ll present a quick example of debugging GCC with gdb for those interested (this applies to GCC plugins as well). I have found this page, on debugging a segmentation fault, particularly useful. For those who just want to step through the compiler for kicks, those instructions will work as well.

Here’s a concrete example:
I’ve installed my GCC at /home/ehren/gcc/dist/bin/gcc and am compiling hello.c which is in my home directory. The compiler was produced with -g3 -O0.

First run the compiler with -v to determine the actual command passed to cc1 or cc1plus (gcc or g++ is just a driver for these programs)
~$ gcc/dist/bin/gcc hello.c -v

The output will contain the following line:
/home/ehren/gcc/dist/libexec/gcc/i686-pc-linux-gnu/4.5.0/cc1 -quiet -v hello.c -quiet -dumpbase hello.c -mtune=generic -auxbase hello -version -o /tmp/cctQO1Kb.s

Now invoke cc1 with gdb:

~$ gdb /home/ehren/gcc/dist/libexec/gcc/i686-pc-linux-gnu/4.5.0/cc1

And then run it using the arguments given in the line above (the -v is now unnecessary but there’s no harm in including it). Set a breakpoint first if you’re not getting an internal compiler error:

(gdb) break main
(gdb) run -quiet -v hello.c -quiet -dumpbase hello.c -mtune=generic -auxbase hello -version -o /tmp/cctQO1Kb.s

And you’re good to go.

Creating a GCC optimization plugin

Posted in Seneca by ehren on November 4, 2009

Since I last posted, a number of developments have taken place with regard to my alwayszero function attribute for gcc bug. I mentioned before that it would probably be necessary to move the optimization into its own pass. Furthermore, that pass should exist within its own GCC plugin. It’s fairly obvious that embedding the optimization within the conditional constant propagation algorithm of tree-ssa-ccp.c was not going to scale to a separate pass, so I had to consider other methods.

Interestingly, I ended up implementing almost exactly what was suggested in this comment by jmdesp except it was necessary to make sure that the gimple call remained a value returning gimple call ie you can’t just set the left hand side to null. I may add the code to bugzilla but I’ll include my new pass here for completeness (and for gcc related google result spam purposes):

/*
   alwayszero_plugin.c

   This plugin contains an optimization pass that affects all functions tagged
   as __attribute__((user("alwayszero"))) (typically these functions already
   return 0 yet GCC cannot determine this).
   
   A statement of the form x = call(); where call is alwayszero will be 
   transformed into dummy_var = call(); x = 0;. Further GCC optimization
   passes will eliminate dummy_var and propagate the value 0 into subsequent
   uses of x.

   Ehren Metcalfe
 */

#include "gcc-plugin.h"
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "toplev.h"
#include "basic-block.h"
#include "gimple.h"
#include "tree.h"
#include "tree-pass.h"
#include "intl.h"
#include <string.h>

int plugin_is_GPL_compatible;

/* Attribute handler callback */

static tree
handle_user_attribute (tree *node, tree name, tree args,
                       int flags, bool *no_add_attrs)
{
  return NULL_TREE;
}

/* Attribute definition */

static struct attribute_spec user_attr =
  { "user", 1, 1, false,  false, false, handle_user_attribute };

/* Plugin callback called during attribute registration */

static void
register_attributes (void *event_data, void *data)
{
  register_attribute (&user_attr);
}

/* Checks if stmt is a gimple call tagged with 
   __attribute__((user("alwayszero"))) */

static bool
is_alwayszero_function (gimple stmt)
{
  if (!is_gimple_call (stmt))
    return false;

  tree fndecl = gimple_call_fndecl (stmt);
  const char* attrarg = NULL;

  if (fndecl != NULL_TREE)
    {
      tree attrlist = DECL_ATTRIBUTES (fndecl);
      if (attrlist != NULL_TREE)
        {
          tree attr;
          for (attr = lookup_attribute("user", attrlist);
               attr != NULL_TREE;
               attr = lookup_attribute("user", TREE_CHAIN (attr)))
            {
              attrarg = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr)));
              if (strcmp(attrarg, "alwayszero") == 0)
                return true;
            }
        }
    }
  return false;
}


/* Entry point for the alwayszero optimization pass.
   Creates a new assignment statement with the lhs of the alwayszero call
   then swaps out the old lhs with a new dummy temporary */

static unsigned int
execute_alwayszero_opt (void)
{
  gimple_stmt_iterator gsi;
  basic_block bb;

  FOR_EACH_BB (bb)
    {
      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
        {
          gimple stmt = gsi_stmt (gsi);
          if (is_gimple_call (stmt) && gimple_call_lhs (stmt) != NULL_TREE &&
              is_alwayszero_function (stmt))
            {
              tree lhs = gimple_get_lhs (stmt);
              tree zero = build_int_cst (TREE_TYPE (lhs), 0);
              gimple assign = gimple_build_assign_stat (lhs, zero);
              tree var = create_tmp_var (TREE_TYPE (lhs), "dummy_var");
              add_referenced_var (var);
              mark_sym_for_renaming (var);
              gimple_call_set_lhs (stmt, var);
              gsi_insert_after (&gsi, assign, GSI_CONTINUE_LINKING);
            }
        }
    }

  return 0;
}


static struct gimple_opt_pass pass_alwayszero =
{
  {
    GIMPLE_PASS,
    "alwayszero",                         /* name */
    NULL,                                 /* gate */
    execute_alwayszero_opt,               /* execute */
    NULL,                                 /* sub */
    NULL,                                 /* next */
    0,                                    /* static_pass_number */
    0,                                    /* tv_id */
    PROP_cfg | PROP_ssa,                  /* properties_required */
    0,                                    /* properties_provided */
    0,                                    /* properties_destroyed */
    0,                                    /* todo_flags_start */
    TODO_dump_func
      | TODO_verify_ssa
      | TODO_update_ssa                   /* todo_flags_finish */
  }
};

/* The initialization routine exposed to and called by GCC. The spec of this
   function is defined in gcc/gcc-plugin.h. */

int
plugin_init (struct plugin_name_args *plugin_info,
             struct plugin_gcc_version *version)
{
  struct register_pass_info pass_info;
  const char *plugin_name = plugin_info->base_name;
  int argc = plugin_info->argc;
  struct plugin_argument *argv = plugin_info->argv;

  /* Handle alwayszero functions near conditional constant propagation */
  pass_info.pass = &pass_alwayszero.pass;
  pass_info.reference_pass_name = "ccp"; 
  pass_info.ref_pass_instance_number = 1;
  /* It seems more logical to insert the pass before ccp, but:
     A) this does the trick anyway, even with regard to dead branch elimination
     B) inserting directly before ccp prevents recognition of user attributes
        for some reason
     C) this pass can go almost anywhere as long as you're in SSA form
   */
  pass_info.pos_op = PASS_POS_INSERT_AFTER;

  /* Register this new pass with GCC */
  register_callback (plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
                     &pass_info);

  /* Register the user attribute */
  register_callback (plugin_name, PLUGIN_ATTRIBUTES, register_attributes, NULL);

  return 0;
}

(edit: the code as originally posted would segfault when used during compilation on x86-64 systems. The above code addresses this problem. See this post for an overview of the fix.)

The bare outline of this code came from the plugin examples in the GCC testsuite. I should point out that it handles the probably rare case when multiple user attributes have been applied to one function declaration ie __attribute__((user("bleh"), user("alwayszero"), user("blah")))
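For anyone who finds the GIMPLE API hard to read, the rewrite itself can be mimicked on a toy statement list. This is a JavaScript sketch under made-up assumptions: the statement objects, the attrs table and the function names are hypothetical and have nothing to do with GCC’s real data structures. It shows two points: any one of several user attributes may be "alwayszero", and the call keeps a left hand side (a fresh dummy) while the original lhs gets the constant 0.

```javascript
// attrs maps a callee name to the arguments of its user(...) attributes.
// Hypothetical data model, not GCC's GIMPLE.
function isAlwaysZero(callee, attrs) {
  // Any one of several user attributes may be "alwayszero".
  return (attrs[callee] || []).includes("alwayszero");
}

// Turn  x = call();  into  dummy_var.N = call(); x = 0;
// for every call to an alwayszero function that has a left hand side.
function rewrite(stmts, attrs) {
  const out = [];
  let counter = 0;
  for (const s of stmts) {
    if (s.kind === "call" && s.lhs !== null && isAlwaysZero(s.callee, attrs)) {
      const dummy = "dummy_var." + counter++;
      // Keep the call (it may have side effects) but redirect its result.
      out.push({ kind: "call", lhs: dummy, callee: s.callee });
      // The original lhs now receives the known constant.
      out.push({ kind: "assign", lhs: s.lhs, rhs: 0 });
    } else {
      out.push(s);
    }
  }
  return out;
}

const attrs = { "Blah::blah": ["bleh", "alwayszero", "blah"] };
const before = [
  { kind: "call", lhs: "x", callee: "Blah::blah" },
  { kind: "call", lhs: "y", callee: "other" },
];

console.log(rewrite(before, attrs));
```

Calls with no left hand side and calls to unmarked functions pass through untouched, which matches the checks at the top of the loop in execute_alwayszero_opt.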

One neat thing about creating a separate pass with GCC is that by applying the TODO_dump_func flag you get a whole bunch of debugging information (using -fdump-tree-all) including the state of the IR after your pass. For example, one attempt at writing the plugin left my test code in this state:

main ()
{
  int x;
  struct Blah b;

<bb 2>:
  b._vptr.Blah = &_ZTV4Blah[2];
  x_4 = Blah::blah (&b);
  x_4 = 0;
  x_5 = x_4;
  printf (&"%d\n"[0], x_5);
  return 0;
}

This is no good because I’m violating SSA. An earlier attempt left the IR in this state:

 
main ()
{
  int x;
  struct Blah b;

<bb 2>:
  b._vptr.Blah = &_ZTV4Blah[2];
  0 = Blah::blah (&b);
  x_5 = x_4;
  printf (&"%d\n"[0], x_5);
  return 0;
}

Important lesson: 0 is not an lvalue.

If you’re curious here’s the result of the final version of my pass (with a bit more info included):

;; Function main (main)

Symbols to be put in SSA form

{ dummy_var.0 }

Incremental SSA update started at block: 0

Number of blocks in CFG: 3
Number of blocks to update: 2 ( 67%)

main ()
{
  int dummy_var.0;
  int x;
  struct Blah b;

<bb 2>:
  b._vptr.Blah = &_ZTV4Blah[2];
  dummy_var.0_1 = Blah::blah (&b);
  x_4 = 0;
  x_5 = x_4;
  printf (&"%d\n"[0], x_5);
  return 0;
}

As an aside, I’m quite lucky to have found the make_rename_temp function because without it a hand crafted gimple call was one thing I was considering (quite a bit of madness is required in building the (arbitrary sized) argument list which is required for the gimple_build_call function).

Anyway, I guess the next step is a Dehydra script to check that every function to which this is applied really returns 0 (because if they don’t they will). A DXR search for NS_OK shows that quite a few functions can have this applied. Noting the results of tests with this optimization in my last post, I’d hope quite a bit of cruft can be eliminated.

Oh, one other irrelevant thing. I wonder if it was considered that bool plugin_is_GPL_compatible = false; is an acceptable definition to enable a plugin?

Btw, the following makefile is required to build the plugin:

GCC=~/gcc/dist/bin/gcc
PLUGIN_SOURCE_FILES= alwayszero_plugin.c
PLUGIN_OBJECT_FILES= $(patsubst %.c,%.o,$(PLUGIN_SOURCE_FILES))
GCCPLUGINS_DIR:= $(shell $(GCC) -print-file-name=plugin)
CFLAGS+= -I$(GCCPLUGINS_DIR)/include -fPIC -O2

alwayszero_plugin.so: $(PLUGIN_OBJECT_FILES)
	$(GCC) -shared $^ -o $@

(taken pretty much verbatim from the internals manual)

GCC should point to GCC 4.5. To enable the plugin use the GCC 4.5 flag:
-fplugin=/Path/To/alwayszero_plugin.so -O2

One more thing… it might be a bit silly to keep using the GNU coding style but I’ve grown a bit fond of it. For example, you can find a function definition immediately using grep '^function_name' *. Maybe I’m just going a bit crazy, however.