Ehren's Blog

A Gimple Call Flags Mystery

Posted in Seneca by ehren on October 21, 2009

It’s been a while since I’ve posted any updates on my gcc alwayszero attribute project (see previous post here). At this point I’ve identified one place where, if several modifications are made, I am certain that the proper semantics for the attribute can be achieved. There is another option, more akin to the rv = call(); --> rv = 0; call(); idea, but I’ll discuss that further below.

More than a week ago I had hit on the idea of placing my optimization in tree-ssa-ccp.c. To quote directly from the comments in this file:

Constant assignments of the form VAR = CST are propagated from the assignments into uses of VAR, which in turn may generate new constants.

I think if I go too far into my fairly murky understanding of this stuff I won’t be able to finish the post, so I’ll just launch into my first idea for the optimization.

During the ccp pass (I suppose multiple “passes” take place), values associated with SSA names (i.e. the unique, singly assigned variable names which can be viewed with -fdump-tree-ssa) are tagged as one of:

/* Possible lattice values.  */
typedef enum
{
  UNINITIALIZED,
  UNDEFINED,
  CONSTANT,
  VARYING
} ccp_lattice_t;

Statements marked VARYING cannot have their outgoing edges simulated further to root out any obvious values. A value corresponding to a name marked CONSTANT, however, will be propagated to other statements which depend solely on this value.
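To make the lattice idea a bit more concrete, here is a rough sketch of how two lattice values might be combined when they flow into the same name. It is a toy model under my own assumptions, not GCC's code: the Value struct and the meet function are hypothetical names, a plain long stands in for the tree constant, and UNINITIALIZED is treated the same as UNDEFINED purely to keep things short.

```cpp
#include <cassert>

// Simplified stand-in for the CCP lattice; not GCC's actual types.
enum Lattice { UNINITIALIZED, UNDEFINED, CONSTANT, VARYING };

struct Value {
  Lattice lattice;
  long constant;  // only meaningful when lattice == CONSTANT
};

// Combine two values that reach the same name (hypothetical helper).
Value meet(Value a, Value b) {
  // A value nothing is known about yet contributes nothing.
  if (a.lattice == UNINITIALIZED || a.lattice == UNDEFINED) return b;
  if (b.lattice == UNINITIALIZED || b.lattice == UNDEFINED) return a;
  // Once either side is VARYING the result stays VARYING.
  if (a.lattice == VARYING || b.lattice == VARYING) return {VARYING, 0};
  // Both sides are CONSTANT here; they only agree if the constants match.
  assert(a.lattice == CONSTANT && b.lattice == CONSTANT);
  if (a.constant == b.constant) return a;
  return {VARYING, 0};
}
```

The point of the sketch is only that values move in one direction: a name can go from unknown to CONSTANT to VARYING, but never back.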

In several places, this pass checks whether a statement is a value returning call, i.e. if is_gimple_call (stmt) && gimple_call_lhs (stmt) != NULL_TREE. My thought is that at some point during the ccp pass I should check if the call flags for such a statement contain the alwayszero flag and, if so, set the lhs of the assignment to 0.

I still have to determine how to actually set this value but I should note that (at least in the context of this pass) the value corresponding to an SSA_NAME is represented using a prop_value_t. Looking in tree-ssa-propagate.h we find the definition:

struct prop_value_d {
    /* Lattice value.  Each propagator is free to define its own
       lattice and this field is only meaningful while propagating.
       It will not be used by substitute_and_fold.  */
    unsigned lattice_val;

    /* Propagated value.  */
    tree value;
};

typedef struct prop_value_d prop_value_t;

Perhaps there is a better way, but my first thought would be to somehow define a new tree representing the constant zero and then pass a new prop_value_t, with this new tree as its value and CONSTANT as its lattice_val, to the set_lattice_value (tree var, prop_value_t new_val) function found in tree-ssa-ccp.c.
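As a sketch of the intended behaviour only (this is not code that plugs into gcc), the idea can be modelled with stand-in types. Everything here is hypothetical: PropValue mimics prop_value_t with a long in place of the tree, Call is a made-up record, and the std::map plays the role of whatever table set_lattice_value updates. ECF_ALWAYSZERO is given the value 1024 that my debug output prints later in this post.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins; none of these are GCC's real data structures.
enum Lattice { UNINITIALIZED, UNDEFINED, CONSTANT, VARYING };
struct PropValue { Lattice lattice_val; long value; };

const int ECF_ALWAYSZERO = 1 << 10;  // 1024, as in the debug output

struct Call {
  std::string lhs;  // SSA name receiving the result, empty if unused
  int flags;        // call flags, a set of bits
};

std::map<std::string, PropValue> lattice;  // SSA name -> current value

// Model of visiting a call statement during propagation.
void visit_call(const Call &call) {
  assert(call.flags >= 0);       // flags are a bit set
  if (call.lhs.empty()) return;  // not a value returning call
  if (call.flags & ECF_ALWAYSZERO)
    lattice[call.lhs] = {CONSTANT, 0};  // result is known to be zero
  else
    lattice[call.lhs] = {VARYING, 0};   // nothing is known about the result
}
```

With this model, visiting a call such as x_1 = blah() with the alwayszero bit set leaves x_1 marked CONSTANT with value 0, which is the state the real propagator would then push into the uses of x_1.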

Before I get ahead of myself, though, the first step is recognizing a value returning gimple call in tree-ssa-ccp. This is extremely simple (I mentioned the condition above), but for some time I was confounded that no gimple calls were being recognized whenever I compiled any of my own test examples (I’m printing to stderr everywhere in gcc). The weird thing was that these printfs were showing up during the compilation of gcc itself (even though I’m using the system’s gcc for the build). I suspected that the functions in my test code were being inlined, or perhaps optimized out in some other way, before even reaching the ccp stage. I did hit upon the idea of compiling the function separately and then linking it into another optimized compile, but this didn’t quite work (I was able to recognize my call flag from other places using this method, though).

Eventually I decided that if gcc recognized gimple calls from tree-ssa-ccp while compiling itself (the system’s compiler must only be used in the preliminary stages), why not just try adding my alwayszero attribute to a whole bunch of functions within gcc? Well, it turns out that a lot of the front end, including the attribute recognition stuff, depends on the system compiler, so that was a no-go. So, why not compile my own modified gcc with a build of itself? That seemed logical to me, although I found that I couldn’t just add my attribute everywhere without causing other problems. Anyway, I decided to ditch this crazy effort and just take Taras’s advice to use a virtual member function. It turns out I only need an extremely simple example to tickle the propagator in the right places:

/* blah.h */
class Blah {
 public:
  virtual int blah() __attribute__((alwayszero));
};

/* blah.cpp */
#include "blah.h"
int Blah::blah() {
  return 0;
}

/* main.cpp */
#include "blah.h"
int main() 
{
  Blah b;
  Blah* p = &b;
  int x;
  x = p->blah();
  return 0;
}

Anyway, the next step is to check whether a value returning gimple call has the alwayszero flag. This should be very simple since it just means recognizing a nonzero value for gimple_call_flags (stmt) & ECF_ALWAYSZERO. So, in the same places where I added printfs to check for a gimple call, I also checked the flag values. And now, as before, I get the proper result when compiling gcc but not when compiling my own test code.

Here’s a bit of build output from gcc’s compilation:

tree-ssa-ccp.c: ccp_visit_stmt: gimple_call found
flags: 112, ECF_ALWAYSZERO: 1024
tree-ssa-ccp.c: visit_assignment: gimple_call found
flags: 112, ECF_ALWAYSZERO: 1024
tree-ssa-ccp.c: likely_value: gimple_call found
flags: 112, ECF_ALWAYSZERO: 1024
tree-ssa-ccp.c: evaluate_stmt: gimple_call found
flags: 112, ECF_ALWAYSZERO: 1024
tree-ssa-ccp.c: ccp_fold: gimple_call found
flags: 112, ECF_ALWAYSZERO: 1024
tree-ssa-ccp.c: get_default_value: gimple_call found
flags: 112, ECF_ALWAYSZERO: 1024

And here’s output from the compilation of my Blah class test program above:

$ ~/gcc/dist/bin/g++ main.cpp blah.cpp -O2
tree-ssa-ccp.c: ccp_visit_stmt: gimple_call found
flags: 0, ECF_ALWAYSZERO: 1024
tree-ssa-ccp.c: visit_assignment: gimple_call found
flags: 0, ECF_ALWAYSZERO: 1024
tree-ssa-ccp.c: likely_value: gimple_call found
flags: 0, ECF_ALWAYSZERO: 1024
tree-ssa-ccp.c: evaluate_stmt: gimple_call found
flags: 0, ECF_ALWAYSZERO: 1024
tree-ssa-ccp.c: ccp_fold: gimple_call found
flags: 0, ECF_ALWAYSZERO: 1024
tree-ssa-ccp.c: get_default_value: gimple_call found
flags: 0, ECF_ALWAYSZERO: 1024
alwayszero recognized in tree-optimize.c (execute_fixup_cfg)

Note the last line, however. Using the test expression I outlined above, it is possible to recognize my attribute from an unrelated area in gcc. For some reason though, at least with this example, my flag value is not getting to the ccp pass. This is not a problem with my attribute itself, since I’ve also tried compiling my Blah example using the attribute nothrow (bit 6) and it does not come through either, even though it clearly can be recognized at this level in principle: it is present in my example gcc build output posted above.
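For anyone who wants to check the arithmetic on those flag words, a small helper like the following does it. The names set_bits and has_flag are made up for this post and are not part of gcc; the only values taken from the output above are 112, 0, and 1024.

```cpp
#include <cassert>
#include <vector>

const int ECF_ALWAYSZERO = 1 << 10;  // 1024, as printed in the output above

// Return the positions of the bits set in a call-flags word.
std::vector<int> set_bits(int flags) {
  assert(flags >= 0);
  std::vector<int> bits;
  for (int bit = 0; flags >> bit; ++bit)
    if ((flags >> bit) & 1) bits.push_back(bit);
  return bits;
}

// The same test the debug printfs perform: is the mask present in flags?
bool has_flag(int flags, int mask) { return (flags & mask) != 0; }
```

Since 112 = 16 + 32 + 64, set_bits(112) gives bits 4, 5 and 6, so bit 6 is set in the gcc build output, while 112 & 1024 is zero. For the Blah example the flags word is 0, so no bits at all reach the pass.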

I did consider that the options for the ccp pass might be set to automatically quash attributes, so I tried adding PROP_gimple_any (signifies full gimple grammar) to the ‘properties required’ bitflag in the pass_ccp options specifier:

struct gimple_opt_pass pass_ccp = 
{
 {
  GIMPLE_PASS,
  "ccp",        /* name */
  gate_ccp,       /* gate */
  do_ssa_ccp,       /* execute */
  NULL,         /* sub */
  NULL,         /* next */
  0,          /* static_pass_number */
  TV_TREE_CCP,        /* tv_id */
  PROP_cfg | PROP_ssa | PROP_gimple_any,      /* properties_required  */
  0,          /* properties_provided */
  0,          /* properties_destroyed */
  0,          /* todo_flags_start */
  TODO_dump_func | TODO_verify_ssa
  | TODO_verify_stmts | TODO_ggc_collect  /* todo_flags_finish */
 }
};

Naturally, this is not necessary (and doesn’t improve things) since some code (like those gcc functions) does allow an attribute to get through to the ccp pass.

At this point I’m somewhat stumped but I’d like to discuss some alternatives unrelated to the problem outlined above. Actually that last code example I posted is an example of how easy it is to define a new gimple optimization pass in gcc. This struct (from tree-pass.h) offers some idea of the info required:

struct register_pass_info
{
  struct opt_pass *pass;            /* New pass to register.  */
  const char *reference_pass_name;  /* Name of the reference pass for hooking
                                       up the new pass.  */
  int ref_pass_instance_number;     /* Insert the pass at the specified
                                       instance number of the reference pass.
                                       Do it for every instance if it is 0.  */
  enum pass_positioning_ops pos_op; /* how to insert the new pass.  */
};

A call is then made to extern void register_pass (struct register_pass_info *). If I didn’t want to involve myself with the propagator at all, I could take more of a syntactic approach, i.e. in a new pass I could simply null out the lhs of a value returning gimple call and then insert a new block before the gimple call’s block that contains rv = 0. The code for this may even look something like the block manipulation code found in tree-ssa-propagate.c. This route seems more complicated, however (and I think it will require a cfg fixup). It might also be possible to try the propagator based optimization I described above in a new pass (before the ccp pass perhaps), but I would think grafting it into tree-ssa-ccp.c should be my first step.
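To illustrate just the syntactic rewrite (and nothing about how gcc represents statements or blocks), here is a toy version over a flat list of statements. The Stmt struct and rewrite_alwayszero are hypothetical names invented for this sketch, and it inserts the assignment directly before the call rather than in a separate block, which sidesteps the cfg questions entirely.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy IR; real GIMPLE statements and blocks are far richer.
const int ECF_ALWAYSZERO = 1 << 10;  // 1024, as in the debug output

struct Stmt {
  std::string lhs;     // destination name, empty if there is none
  std::string callee;  // function name, empty for a plain "lhs = 0"
  int flags;           // call flags, ignored for non-calls
};

// Turn "rv = call()" into "rv = 0; call()" for every alwayszero call.
std::vector<Stmt> rewrite_alwayszero(const std::vector<Stmt> &in) {
  std::vector<Stmt> out;
  for (const Stmt &s : in) {
    bool is_call = !s.callee.empty();
    if (is_call && !s.lhs.empty() && (s.flags & ECF_ALWAYSZERO)) {
      out.push_back({s.lhs, "", 0});           // rv = 0
      out.push_back({"", s.callee, s.flags});  // call(), result dropped
    } else {
      out.push_back(s);  // everything else is left alone
    }
  }
  return out;
}
```

The call itself is kept because it may still have side effects; only its result is replaced by the constant.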

Anyway, for the moment I’m going to continue trying to get my attribute recognized from tree-ssa-ccp.c, since I suspect it’s again only a matter of my test code being too simple.

On another note I might have to start using gdb on gcc which will likely involve using emacs as is described here. The horror.

* Edit: A gcc dev pointed me in the right direction (I think) for fixing the attribute problem. Details later.
* Edit2: Fix not as straightforward as I thought.
