Ehren's Blog

Creating a GCC optimization plugin

Posted in Seneca by ehren on November 4, 2009

Since my last post, a number of developments have taken place with regard to my alwayszero function attribute for gcc bug. I mentioned before that it would probably be necessary to move the optimization into its own pass, and that the pass should live in its own GCC plugin. It was fairly obvious that embedding the optimization within the conditional constant propagation algorithm of tree-ssa-ccp.c was not going to carry over to a separate pass, so I had to consider other methods.

Interestingly, I ended up implementing almost exactly what was suggested in this comment by jmdesp, except that it was necessary to make sure the gimple call remained a value-returning gimple call, i.e. you can’t just set the left hand side to null. I may add the code to bugzilla, but I’ll include my new pass here for completeness (and for gcc-related Google result spam purposes):

/*
   alwayszero_plugin.c

   This plugin contains an optimization pass that affects all functions tagged
   as __attribute__((user("alwayszero"))) (typically these functions already
   return 0 yet GCC cannot determine this).
   
   A statement of the form x = call(); where call is alwayszero will be 
   transformed into dummy_var = call(); x = 0;. Further GCC optimization
   passes will eliminate dummy_var and propagate the value 0 into subsequent
   uses of x.

   Ehren Metcalfe
 */

#include "gcc-plugin.h"
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "toplev.h"
#include "basic-block.h"
#include "gimple.h"
#include "tree.h"
#include "tree-pass.h"
#include "intl.h"
#include <string.h>

int plugin_is_GPL_compatible;

/* Attribute handler callback */

static tree
handle_user_attribute (tree *node, tree name, tree args,
                       int flags, bool *no_add_attrs)
{
  return NULL_TREE;
}

/* Attribute definition */

static struct attribute_spec user_attr =
  { "user", 1, 1, false,  false, false, handle_user_attribute };

/* Plugin callback called during attribute registration */

static void
register_attributes (void *event_data, void *data)
{
  register_attribute (&user_attr);
}

/* Checks if stmt is a gimple call tagged with 
   __attribute__((user("alwayszero"))) */

static bool
is_alwayszero_function (gimple stmt)
{
  if (!is_gimple_call (stmt))
    return false;

  tree fndecl = gimple_call_fndecl (stmt);
  const char* attrarg = NULL;

  if (fndecl != NULL_TREE)
    {
      tree attrlist = DECL_ATTRIBUTES (fndecl);
      if (attrlist != NULL_TREE)
        {
          tree attr;
          for (attr = lookup_attribute("user", attrlist);
               attr != NULL_TREE;
               attr = lookup_attribute("user", TREE_CHAIN (attr)))
            {
              attrarg = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr)));
              if (strcmp(attrarg, "alwayszero") == 0)
                return true;
            }
        }
    }
  return false;
}


/* Entry point for the alwayszero optimization pass.
   Creates a new assignment statement with the lhs of the alwayszero call
   then swaps out the old lhs with a new dummy temporary */

static unsigned int
execute_alwayszero_opt (void)
{
  gimple_stmt_iterator gsi;
  basic_block bb;

  FOR_EACH_BB (bb)
    {
      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
        {
          gimple stmt = gsi_stmt (gsi);
          if (is_gimple_call (stmt) && gimple_call_lhs (stmt) != NULL_TREE &&
              is_alwayszero_function (stmt))
            {
              tree lhs = gimple_get_lhs (stmt);
              tree zero = build_int_cst (TREE_TYPE (lhs), 0);
              gimple assign = gimple_build_assign (lhs, zero);
              tree var = create_tmp_var (TREE_TYPE (lhs), "dummy_var");
              add_referenced_var (var);
              mark_sym_for_renaming (var);
              gimple_call_set_lhs (stmt, var);
              gsi_insert_after (&gsi, assign, GSI_CONTINUE_LINKING);
            }
        }
    }

  return 0;
}


static struct gimple_opt_pass pass_alwayszero =
{
  {
    GIMPLE_PASS,
    "alwayszero",                         /* name */
    NULL,                                 /* gate */
    execute_alwayszero_opt,               /* execute */
    NULL,                                 /* sub */
    NULL,                                 /* next */
    0,                                    /* static_pass_number */
    0,                                    /* tv_id */
    PROP_cfg | PROP_ssa,                  /* properties_required */
    0,                                    /* properties_provided */
    0,                                    /* properties_destroyed */
    0,                                    /* todo_flags_start */
    TODO_dump_func
      | TODO_verify_ssa
      | TODO_update_ssa                   /* todo_flags_finish */
  }
};

/* The initialization routine exposed to and called by GCC. The spec of this
   function is defined in gcc/gcc-plugin.h. */

int
plugin_init (struct plugin_name_args *plugin_info,
             struct plugin_gcc_version *version)
{
  struct register_pass_info pass_info;
  const char *plugin_name = plugin_info->base_name;
  int argc = plugin_info->argc;
  struct plugin_argument *argv = plugin_info->argv;

  /* Handle alwayszero functions near conditional constant propagation */
  pass_info.pass = &pass_alwayszero.pass;
  pass_info.reference_pass_name = "ccp"; 
  pass_info.ref_pass_instance_number = 1;
  /* It seems more logical to insert the pass before ccp, but:
     A) this does the trick anyway, even with regard to dead branch elimination
     B) inserting directly before ccp prevents recognition of user attributes
        for some reason
     C) this pass can go almost anywhere as long as you're in SSA form
   */
  pass_info.pos_op = PASS_POS_INSERT_AFTER;

  /* Register this new pass with GCC */
  register_callback (plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
                     &pass_info);

  /* Register the user attribute */
  register_callback (plugin_name, PLUGIN_ATTRIBUTES, register_attributes, NULL);

  return 0;
}

(edit: the code as originally posted would segfault when used during compilation on x86-64 systems. The above code addresses this problem. See this post for an overview of the fix.)

The bare outline of this code came from the plugin examples in the GCC testsuite. I should point out that it handles the (probably rare) case where multiple user attributes have been applied to one function declaration, i.e. __attribute__((user("bleh"), user("alwayszero"), user("blah"))).
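To make the multiple-attribute case concrete, here is a minimal standalone sketch of the same loop shape used in is_alwayszero_function. It does not use GCC's tree API: struct toy_attr, toy_lookup_attribute and toy_is_alwayszero are hypothetical stand-ins I made up for illustration, with a plain linked list playing the role of the attribute list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for GCC's attribute list: one node per attribute,
   holding the attribute name and its single string argument.  */
struct toy_attr
{
  const char *name;
  const char *arg;
  const struct toy_attr *next;
};

/* Toy counterpart of lookup_attribute: return the first node at or after
   LIST whose name is NAME, or NULL if there is none.  */
static const struct toy_attr *
toy_lookup_attribute (const char *name, const struct toy_attr *list)
{
  for (; list != NULL; list = list->next)
    if (strcmp (list->name, name) == 0)
      return list;
  return NULL;
}

/* Same loop shape as is_alwayszero_function: keep asking for the next
   "user" attribute until one carries the argument "alwayszero".  */
int
toy_is_alwayszero (const struct toy_attr *list)
{
  const struct toy_attr *attr;

  for (attr = toy_lookup_attribute ("user", list);
       attr != NULL;
       attr = toy_lookup_attribute ("user", attr->next))
    {
      /* The "user" attribute is declared with exactly one argument.  */
      assert (attr->arg != NULL);
      if (strcmp (attr->arg, "alwayszero") == 0)
        return 1;
    }
  return 0;
}
```

The point of restarting the lookup from the node after the current match is that stopping at the first "user" attribute would miss an "alwayszero" that sits second or third in the list.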

One neat thing about creating a separate pass with GCC is that by applying the TODO_dump_func flag you get a whole bunch of debugging information (using -fdump-tree-all) including the state of the IR after your pass. For example, one attempt at writing the plugin left my test code in this state:

main ()
{
  int x;
  struct Blah b;

<bb 2>:
  b._vptr.Blah = &_ZTV4Blah[2];
  x_4 = Blah::blah (&b);
  x_4 = 0;
  x_5 = x_4;
  printf (&"%d\n"[0], x_5);
  return 0;
}

This is no good because I’m violating SSA. An earlier attempt left the IR in this state:

 
main ()
{
  int x;
  struct Blah b;

<bb 2>:
  b._vptr.Blah = &_ZTV4Blah[2];
  0 = Blah::blah (&b);
  x_5 = x_4;
  printf (&"%d\n"[0], x_5);
  return 0;
}

Important lesson: 0 is not an lvalue.
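For anyone unsure why the first dump above breaks SSA: x_4 is assigned twice, and in SSA form each name may be defined only once. GCC performs its own verification (that is what TODO_verify_ssa is for); the following is only a toy sketch of the single-definition rule, with a made-up function name and a list of strings standing in for the statements' left hand sides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the index of the first statement that redefines a name already
   defined by an earlier statement, or -1 if every name is defined once.
   DEFS[i] is the SSA name written by statement i; NULL means statement i
   defines no SSA name (a store to memory, a bare call, a return).  */
int
toy_first_ssa_violation (const char *const *defs, size_t n)
{
  size_t i, j;

  assert (defs != NULL || n == 0);
  for (i = 0; i < n; i++)
    {
      if (defs[i] == NULL)
        continue;
      for (j = 0; j < i; j++)
        if (defs[j] != NULL && strcmp (defs[i], defs[j]) == 0)
          return (int) i;
    }
  return -1;
}
```

Feeding it the left hand sides of the first dump (nothing, x_4, x_4, x_5, nothing, nothing) flags the statement x_4 = 0 as the offender.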

If you’re curious, here’s the result of the final version of my pass (with a bit more info included):

;; Function main (main)

Symbols to be put in SSA form

{ dummy_var.0 }

Incremental SSA update started at block: 0

Number of blocks in CFG: 3
Number of blocks to update: 2 ( 67%)

main ()
{
  int dummy_var.0;
  int x;
  struct Blah b;

<bb 2>:
  b._vptr.Blah = &_ZTV4Blah[2];
  dummy_var.0_1 = Blah::blah (&b);
  x_4 = 0;
  x_5 = x_4;
  printf (&"%d\n"[0], x_5);
  return 0;
}
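At the source level, the effect of the pass looks roughly like the sketch below. This is ordinary C rather than GIMPLE, and every name in it is hypothetical: a function pointer stands in for the virtual call so that the compiler cannot see the return value on its own.

```c
#include <assert.h>

/* Hypothetical stand-in for a function tagged alwayszero: it has a side
   effect and always returns 0, but callers only see a function pointer.  */
int calls_made;

static int
blah_impl (void)
{
  calls_made++;
  return 0;
}

int (*blah) (void) = blah_impl;

/* What the programmer wrote: x = call (); followed by an error check.  */
int
before_pass (void)
{
  int x = blah ();
  if (x != 0)
    return -1;                  /* error handling */
  return x;
}

/* What the pass effectively produces: the call is kept for its side
   effects, its result goes to a dummy, and x is a known constant, so
   later passes can drop the dummy and the dead error branch.  */
int
after_pass (void)
{
  int dummy_var = blah ();
  int x = 0;

  /* Only valid because blah really does return 0.  */
  assert (dummy_var == 0);
  (void) dummy_var;
  return x;
}
```

Both versions still make the call, which matters because the function may have side effects; only the use of its return value changes.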

As an aside, I’m quite lucky to have found the make_rename_temp function, because without it I was considering hand-crafting a new gimple call (quite a bit of madness is required to build the arbitrarily sized argument list that gimple_build_call needs).

Anyway, I guess the next step is a Dehydra script to check that every function to which this is applied really returns 0 (because if they don’t, they will once the attribute is applied). A DXR search for NS_OK shows that quite a few functions could have this applied. Given the results of the tests with this optimization in my last post, I’d hope quite a bit of cruft can be eliminated.

Oh, one other irrelevant thing. I wonder if it was considered that bool plugin_is_GPL_compatible = false; is an acceptable definition to enable a plugin?

Btw, the following makefile is required to build the plugin:

GCC=~/gcc/dist/bin/gcc
PLUGIN_SOURCE_FILES= alwayszero_plugin.c
PLUGIN_OBJECT_FILES= $(patsubst %.c,%.o,$(PLUGIN_SOURCE_FILES))
GCCPLUGINS_DIR:= $(shell $(GCC) -print-file-name=plugin)
CFLAGS+= -I$(GCCPLUGINS_DIR)/include -fPIC -O2

alwayszero_plugin.so: $(PLUGIN_OBJECT_FILES)
	$(GCC) -shared $^ -o $@

(taken pretty much verbatim from the internals manual)

GCC should point to GCC 4.5. To enable the plugin use the GCC 4.5 flag:
-fplugin=/Path/To/alwayszero_plugin.so -O2

One more thing… it might be a bit silly to keep using the GNU coding style but I’ve grown a bit fond of it. For example, you can find a function definition immediately using grep '^function_name' *. Maybe I’m just going a bit crazy, however.
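A small illustration of the grep trick, using made-up function names: the return type goes on its own line, so the name starts in column 0 only at the definition.

```c
#include <assert.h>
#include <stddef.h>

/* GNU style: the return type sits on its own line, so at its definition
   the function name starts at the beginning of the line.  */
size_t
count_zero_bytes (const unsigned char *buf, size_t len)
{
  size_t i, n = 0;

  assert (buf != NULL || len == 0);
  for (i = 0; i < len; i++)
    if (buf[i] == 0)
      n++;
  return n;
}

/* Call sites are indented inside a body, so grep '^count_zero_bytes'
   matches only the definition above, not this use.  */
int
has_zero_byte (const unsigned char *buf, size_t len)
{
  return count_zero_bytes (buf, len) > 0;
}
```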

8 Responses


  1. roy_hu said, on April 16, 2010 at 3:22 am

    Considering I know nothing about gcc internals, this must be a silly question. Why create a dummy variable that holds the return value of the function call, and leave it for later passes? Why cannot you just call the function, or even better, remove the function call (assuming it’s side-effect free)?

    • ehren said, on April 16, 2010 at 4:51 am

      The idea was to trick the compiler into recognizing that the return value of certain virtual member functions is always zero (something that normally cannot be determined at compile time). If the return value of these functions is checked, e.g.

      ret = Blah::virtualfunc();
      if (ret)
      // error handling

      then the checks and error handling will be removed. The unused dummy variable will also be removed, making the code look like this:

      Blah::virtualfunc();
      ret = 0;

      Doing this is not always a win, though, as is described in later posts.

      Looking for side effect free functions (that aren’t there just to implement some interface) is another matter.

      • roy_hu said, on April 16, 2010 at 5:19 pm

        My question is, why not transform the code from:

        ret = Blah::virtualfunc();

        to:
        Blah::virtualfunc();
        ret = 0;

        in a single pass? Why create a dummy variable to hold the return value?

      • ehren said, on April 16, 2010 at 6:27 pm

        You can’t just set the lhs to null (as mentioned). There may be some gcc macro to do it but I never found one. The only other way I can think of doing what you’re suggesting is to build a new gimple call based on the rhs of the old call. That means dealing with a variable number of arguments etc etc. Even if there is an easy way it would still be more complicated than this. Keep in mind that all of my code is located in a single (plugin) pass and that removing the dummy var is the result of gcc magic somewhere else.

  2. kd said, on June 16, 2010 at 4:00 pm

    If I have to dump the final most gimple in ssa form (after all optimizations) using a plugin like the one described above, where would I need to insert such a pass? Also, will marking a TODO item in the pass descriptor (more specifically, TODO_dump_func) be unsafe for other optimization passes to follow?

    • ehren said, on July 6, 2010 at 4:17 am

      Hi kd,
      Very sorry for not replying sooner. For a good example of an SSA pass plugin check out gcc/testsuite/gcc.dg/plugin/selfassign.c which inserts itself after pass “ssa”. IIRC, without enabling optimization gcc may not bother to convert the IR into SSA. As to the dumping of individual instructions, take a look at print-tree.c.

      I would also assume that inserting an extra pass via a plugin will not have any effect on the generated code. Setting a dump flag in particular should not be a problem. However, without enabling optimization you may not have access to the IR in SSA form.

      Let me know if I can be of any other help,

      Ehren

      • kd said, on July 6, 2010 at 6:00 am

        Thank you. If I have any further questions, I will write to you.

        Thanks once again.

  3. linuxqwerty said, on July 16, 2015 at 5:14 am

    Is there any way to identify the output of compilation after each stage of GIMPLE, SSA, RTL?

