<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Ehren&#039;s Blog</title>
	<atom:link href="http://ehren.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://ehren.wordpress.com</link>
	<description>Ehren Metcalfe - 6th Semester BSD student taking DPS911: Open Source Project</description>
	<lastBuildDate>Wed, 04 Jan 2012 17:02:01 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='ehren.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Ehren&#039;s Blog</title>
		<link>http://ehren.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://ehren.wordpress.com/osd.xml" title="Ehren&#039;s Blog" />
	<atom:link rel='hub' href='http://ehren.wordpress.com/?pushpress=hub'/>
		<item>
		<title>RTL level function removal</title>
		<link>http://ehren.wordpress.com/2010/04/21/rtl-level-function-removal/</link>
		<comments>http://ehren.wordpress.com/2010/04/21/rtl-level-function-removal/#comments</comments>
		<pubDate>Thu, 22 Apr 2010 04:38:10 +0000</pubDate>
		<dc:creator>ehren</dc:creator>
				<category><![CDATA[Seneca]]></category>

		<guid isPermaLink="false">http://ehren.wordpress.com/?p=496</guid>
		<description><![CDATA[Over the past few days I&#8217;ve been focusing on getting the call graph portion of my dead code analysis in check. It turns out that function local const (or static) initializations are not accessible during later gcc passes. Luckily, walking the front end tree representation, which is accessible via Treehydra&#8217;s process_cp_pre_genericize, does the trick. This [...]]]></description>
			<content:encoded><![CDATA[<p>Over the past few days I&#8217;ve been focusing on getting the call graph portion of my dead code analysis in check. It turns out that function-local const (or static) initializations are not accessible during later GCC passes. Luckily, walking the front-end tree representation, which is accessible via <a href="https://developer.mozilla.org/en/Treehydra">Treehydra</a>&#8217;s process_cp_pre_genericize, does the trick. This takes care of all remaining false positives of which I am aware.</p>
<p>The downside is that going through all these extra tree codes is sloooow. After a bunch of false starts the damn thing is still running on one of <a href="http://cdot.senecac.on.ca/">CDOT</a>&#8217;s development machines (probably an 8+ hour compile).</p>
<p>For a little while though, I&#8217;ve been thinking of ways to automatically remove identified dead code without actually patching the source. By applying Taras&#8217; <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=547581">assembler name patch</a> to dehydra I can now identify precisely which code to remove. The question now is how to remove it.</p>
<p>My first thought was a hack using <a href="http://sourceware.org/binutils/docs/binutils/objcopy.html">objcopy</a>. I could first output a list of the mangled names of dead functions and then, after running a build with <a href="http://gcc.gnu.org/onlinedocs/gcc-4.0.4/gcc/Optimize-Options.html#index-ffunction_002dsections-545">-ffunction-sections</a>, run a script like this:</p>
<p><pre class="brush: bash;">
#!/bin/sh
while read asmname
do
  find objdir -name &quot;*.o&quot; | xargs objcopy --remove-section=&quot;.text.$asmname&quot;
done &lt; asmlist.txt
</pre></p>
<p>(and then relinking). This works, but only for non-member functions. </p>
<p>The other option was some sort of gcc hack to remove as much code as possible when objcopy can&#8217;t do the job. I first tried removing every statement in every basic block of the function (see <a href="http://gcc.gnu.org/onlinedocs/gccint/Sequence-iterators.html#Sequence-iterators">gsi_remove</a>). This seems to work only with a non-branching CFG, however (even when I leave a valid return statement). I then tried <a href="http://gcc.gnu.org/viewcvs/trunk/gcc/cgraph.c?revision=158607&amp;view=markup#l1288">cgraph_remove_node</a> in an IPA pass plugin, which blows up if the function is referenced anywhere else.</p>
<p>Today I arrived at a solution that, although requiring a direct patch of GCC, seems to be ideal. Surprisingly, it&#8217;s possible to hook in right before assembly generation, and it&#8217;s easy too:</p>
<p><pre class="brush: cpp;">
diff --git a/gcc/final.c b/gcc/final.c
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -4090,16 +4090,34 @@
 {
   if (symbol_queue)
     {
       free (symbol_queue);
       symbol_queue = NULL;
       symbol_queue_size = 0;
     }
 }
+
+static bool
+is_dead(const char* name) 
+{
+  char asmname[100];
+  FILE* fp = fopen(&quot;/home/ehren/asmlist.txt&quot;, &quot;r&quot;);
+  if (!fp)
+    return false;
+
+  while (fscanf(fp, &quot;%99s&quot;, asmname) == 1) {
+    if (strcmp(asmname, name) == 0) {
+      fclose(fp);
+      return true;
+    }
+  }
+  fclose(fp);
+  return false;
+}
 
 /* Turn the RTL into assembly.  */
 static unsigned int
 rest_of_handle_final (void)
 {
   rtx x;
   const char *fnname;
 
@@ -4109,17 +4127,19 @@
   x = DECL_RTL (current_function_decl);
   gcc_assert (MEM_P (x));
   x = XEXP (x, 0);
   gcc_assert (GET_CODE (x) == SYMBOL_REF);
   fnname = XSTR (x, 0);
 
   assemble_start_function (current_function_decl, fnname);
   final_start_function (get_insns (), asm_out_file, optimize);
-  final (get_insns (), asm_out_file, optimize);
+  if (!is_dead(fnname)) {
+    final (get_insns (), asm_out_file, optimize);
+  }
   final_end_function ();
 
 #ifdef TARGET_UNWIND_INFO
   /* ??? The IA-64 &quot;.handlerdata&quot; directive must be issued before
      the &quot;.endp&quot; directive that closes the procedure descriptor.  */
   output_function_exception_table (fnname);
 #endif
</pre></p>
<p>With this, functions are removed completely (except from the symbol table) and the bodies of virtuals are replaced with a couple of words&#8217; worth of NOPs. Yes, opening the file from a hardcoded path for every function is ugly, but later on I could always read it into a global somewhere else (and do a binary search).</p>
<p>The only downside here is I won&#8217;t get any link time errors if a few false positives slip through, as opposed to with objcopy. In my experiments, calling a function that doesn&#8217;t exist results in an immediate segmentation fault (makes sense), but storing the return of a NOP-body virtual just leaves you with an uninitialized value.</p>
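<p>As for the per-function file read in is_dead, here is a minimal sketch of the &quot;read it into a global and binary search&quot; idea, assuming the same list format of one mangled name per whitespace-separated token. It is standalone C rather than GCC code, and the names load_dead_list, is_dead_sorted, dead_names, dead_count and dead_cap are hypothetical, not part of the patch above.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache of dead function names, sorted for bsearch.  */
static char **dead_names;
static size_t dead_count;
static size_t dead_cap;

/* Compare two array elements, each a pointer to a C string.  */
static int
cmp_names (const void *a, const void *b)
{
  return strcmp (*(char *const *) a, *(char *const *) b);
}

/* Read whitespace-separated names from PATH once, then sort them.
   Returns 0 on success and -1 if the file cannot be opened or
   memory runs out.  */
int
load_dead_list (const char *path)
{
  char buf[1024];
  FILE *fp;

  assert (path != NULL);
  fp = fopen (path, "r");
  if (!fp)
    return -1;

  /* The field width keeps fscanf from overrunning buf.  */
  while (fscanf (fp, "%1023s", buf) == 1)
    {
      char *copy;

      if (dead_count == dead_cap)
        {
          size_t new_cap = dead_cap ? dead_cap * 2 : 64;
          char **grown = realloc (dead_names, new_cap * sizeof *grown);
          if (!grown)
            {
              fclose (fp);
              return -1;
            }
          dead_names = grown;
          dead_cap = new_cap;
        }

      copy = malloc (strlen (buf) + 1);
      if (!copy)
        {
          fclose (fp);
          return -1;
        }
      strcpy (copy, buf);
      dead_names[dead_count++] = copy;
    }

  fclose (fp);
  if (dead_count > 0)
    qsort (dead_names, dead_count, sizeof *dead_names, cmp_names);
  return 0;
}

/* Return nonzero if NAME is in the loaded, sorted list.  */
int
is_dead_sorted (const char *name)
{
  if (dead_count == 0)
    return 0;
  return bsearch (&name, dead_names, dead_count,
                  sizeof *dead_names, cmp_names) != NULL;
}
```

<p>The list would be loaded once (for example at compiler startup) and each function would then cost one O(log n) lookup instead of a full scan of the file.</p>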
<p>Hopefully, I&#8217;ll soon have some good results on the analysis front to actually test this on Mozilla.</p>
]]></content:encoded>
			<wfw:commentRss>http://ehren.wordpress.com/2010/04/21/rtl-level-function-removal/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/036e871fb77da20c503fb7438ca5ca6a?s=96&#38;d=identicon&#38;r=R" medium="image">
			<media:title type="html">ehren</media:title>
		</media:content>
	</item>
		<item>
		<title>Dead code progress</title>
		<link>http://ehren.wordpress.com/2010/04/07/dead-code-progress/</link>
		<comments>http://ehren.wordpress.com/2010/04/07/dead-code-progress/#comments</comments>
		<pubDate>Thu, 08 Apr 2010 02:04:15 +0000</pubDate>
		<dc:creator>ehren</dc:creator>
				<category><![CDATA[Seneca]]></category>

		<guid isPermaLink="false">http://ehren.wordpress.com/?p=479</guid>
		<description><![CDATA[So far things are on track with my attempts to develop an unused function finding tool. Now that the function pointer/jump table problem has been solved, other more subtle issues have come to light. The first was a problem with callgraph&#8217;s handling of inheritance chains. As I mentioned previously, it was necessary to add each [...]]]></description>
			<content:encoded><![CDATA[<p>So far things are on track with my attempts to develop an unused function finding tool. Now that the function pointer/jump table problem <a href="http://ehren.wordpress.com/2010/04/06/function-declaration-escape-analysis-v2/">has been solved</a>, other more subtle issues have come to light.</p>
<p>The first was a problem with <a href="https://developer.mozilla.org/en/Callgraph">callgraph</a>&#8217;s handling of inheritance chains. As I <a href="http://ehren.wordpress.com/2010/04/04/function-declaration-escape-analysis/">mentioned previously</a>, it was necessary to add each method to the node table (see <a href="https://developer.mozilla.org/En/Callgraph/Schema_Reference">schema reference</a>) not only when the method&#8217;s body is processed (as is already the case) but also when the method&#8217;s type is processed. At some point I should really develop some tests here, but this affects the recognition of pure virtual functions in a number of complicated cases.</p>
<p>However, I also ran into another issue where certain methods were not finding their way into the inheritance chains in which they belong. This seems to happen only when a virtual function overrides a base class function that has not been defined in the &#8216;next up&#8217; base class (A defines virtual foo, B derives from A, C derives from B and redefines virtual foo). This could be a <a href="https://developer.mozilla.org/en/Treehydra">Treehydra</a> issue or maybe a problem with the GCC <a href="http://gcc.gnu.org/onlinedocs/gccint/Classes.html">binfo</a> data structure (or maybe I&#8217;m just misunderstanding things).</p>
<p>Either way, my solution has been to process all base classes and subclasses of a method both when the type is processed and also when the method body is processed. This appears to be a working solution (although it certainly does not improve callgraph compilation times). </p>
<p>Once this was handled I started to get some pretty good results, but I noticed scriptable methods were getting into the mix. After a few hours of fruitless hacking it turned out I had just forgotten to properly define <a href="http://mxr.mozilla.org/mozilla-central/source/xpcom/base/nscore.h#518">NS_SCRIPTABLE</a> (since I&#8217;m not running a <a href="https://developer.mozilla.org/en/Building_with_static_checking">--with-static-checking</a> build). After rebuilding again, I believe I finally attained a 0% false positive rate.</p>
<p>This time I hit another problem though. A bunch of genuinely dead methods turned up by my most recent (defective) analysis were not showing up. In fact, very few methods were showing up at all. Investigating, it turns out I&#8217;ve been quite overzealous in marking methods as scriptable. My previous technique was to check if <code>__attribute__((user("NS_script")))</code> was present in the declaration attributes of a function and also to check if it was present in the type attributes of the class and <em>any base class</em>. This excludes a bunch of juicy dead stuff like <a href="http://mxr.mozilla.org/mozilla-central/search?string=SetCharsetCheckmark">nsCharsetMenu::SetCharsetCheckmark</a> (gimple insn count 75 ftw) which is a member of a non-scriptable class that derives from two scriptable interfaces (which do not declare SetCharsetCheckmark).</p>
<p>Naturally, the solution is to mark a method scriptable because of a base class only when that base class both declares the method and is scriptable. Come to think of it, I probably don&#8217;t even need to do this because of the way I group together base and derived methods.</p>
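<p>A rough sketch of that rule in standalone C, covering only the base class part of the check (the attribute on the method&#8217;s own declaration would be tested separately). The struct class_info layout and the names declares and scriptable_via_bases are hypothetical illustrations, not Treehydra or callgraph APIs.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BASES 4
#define MAX_METHODS 8

/* Hypothetical, simplified view of a class for this sketch.  */
struct class_info
{
  const char *name;
  int scriptable;                             /* type carries the attribute */
  const char *methods[MAX_METHODS];           /* declared here, NULL-terminated */
  const struct class_info *bases[MAX_BASES];  /* direct bases, NULL-terminated */
};

/* Return nonzero if CLS itself declares METHOD.  */
int
declares (const struct class_info *cls, const char *method)
{
  size_t i;

  assert (cls != NULL && method != NULL);
  for (i = 0; i < MAX_METHODS && cls->methods[i]; i++)
    if (strcmp (cls->methods[i], method) == 0)
      return 1;
  return 0;
}

/* Return nonzero if some direct or indirect base of CLS is scriptable
   AND declares METHOD.  A scriptable base that does not declare the
   method contributes nothing.  */
int
scriptable_via_bases (const struct class_info *cls, const char *method)
{
  size_t i;

  assert (cls != NULL && method != NULL);
  for (i = 0; i < MAX_BASES && cls->bases[i]; i++)
    {
      const struct class_info *base = cls->bases[i];
      if ((base->scriptable && declares (base, method))
          || scriptable_via_bases (base, method))
        return 1;
    }
  return 0;
}
```

<p>With a non-scriptable class deriving from two scriptable interfaces, a method declared only in the derived class is no longer treated as scriptable, while a method that one of the interfaces declares still is.</p>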
<p>Anyway, my current status is waiting for a build with these changes to finish. We will see if there are more issues.</p>
]]></content:encoded>
			<wfw:commentRss>http://ehren.wordpress.com/2010/04/07/dead-code-progress/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/036e871fb77da20c503fb7438ca5ca6a?s=96&#38;d=identicon&#38;r=R" medium="image">
			<media:title type="html">ehren</media:title>
		</media:content>
	</item>
		<item>
		<title>Function declaration escape analysis v2</title>
		<link>http://ehren.wordpress.com/2010/04/06/function-declaration-escape-analysis-v2/</link>
		<comments>http://ehren.wordpress.com/2010/04/06/function-declaration-escape-analysis-v2/#comments</comments>
		<pubDate>Tue, 06 Apr 2010 07:46:45 +0000</pubDate>
		<dc:creator>ehren</dc:creator>
				<category><![CDATA[Seneca]]></category>

		<guid isPermaLink="false">http://ehren.wordpress.com/?p=467</guid>
		<description><![CDATA[I don&#8217;t want to get too excited until I&#8217;ve run this through 4000000 loc but I believe I&#8217;ve solved the problem of being unable to process global initializations of const/static global variables. Earlier, I posted a message to the GCC mailing list describing my troubles with varpool. I did receive a helpful response in that [...]]]></description>
			<content:encoded><![CDATA[<p>I don&#8217;t want to get too excited until I&#8217;ve run this through 4,000,000 LOC, but I believe I&#8217;ve solved the problem of being <a href="http://ehren.wordpress.com/2010/04/05/problems-with-const-static-initializations/">unable</a> to process initializations of const/static global variables. Earlier, I posted a <a href="http://gcc.gnu.org/ml/gcc/2010-04/msg00056.html">message</a> to the GCC mailing list describing my troubles with <code>varpool</code>. I did receive a helpful response: there is nothing inherent about const/static declarations that would prevent one from getting at the lhs of their initialization.</p>
<p>Today I experimented with walking as many trees in as many places as I could find without much luck. I then tried compiling this sample code with -Wunused:</p>
<p><pre class="brush: cpp;">
int foo() {
  return 0;
}

typedef struct {
  int (*p) ();
} Table;

static Table t[] = {
  { foo }
};
</pre></p>
<p>As expected, GCC warns about the unused static variable. </p>
<p>Taking a look at <a href="http://gcc.gnu.org/viewcvs/trunk/gcc/toplev.c?view=markup">toplev.c</a>, it didn&#8217;t take too long to find a solution (and this works for const, static, and const static):</p>
<p><pre class="brush: cpp;">
Index: gcc/toplev.c
===================================================================
--- gcc/toplev.c  (revision 157978)
+++ gcc/toplev.c  (working copy)
@@ -844,12 +844,26 @@
   return output_something;
 }

+static tree 
+find_funcs_callback(tree *tp, int *walk_subtrees, void *data)
+{
+  tree t = *tp;
+
+  if (TREE_CODE(t) == FUNCTION_DECL)
+    fprintf(stderr, &quot;address held: %s\n&quot;, IDENTIFIER_POINTER(DECL_NAME(t)));
+
+  return NULL_TREE;
+}
+
 /* A subroutine of check_global_declarations.  Issue appropriate warnings
    for the global declaration DECL.  */

 void
 check_global_declaration_1 (tree decl)
 {
+  if (DECL_INITIAL(decl))
+    walk_tree(&amp;DECL_INITIAL(decl), find_funcs_callback, NULL, NULL);
+
   /* Warn about any function declared static but not defined.  We don't
      warn about variables, because many programs have static variables
      that exist only to get some text into the object file.  */
</pre></p>
<p>I&#8217;ve got to fix up the output to match <a href="https://developer.mozilla.org/en/Callgraph">callgraph</a>&#8217;s serialization but <em>this could be it</em>.</p>
]]></content:encoded>
			<wfw:commentRss>http://ehren.wordpress.com/2010/04/06/function-declaration-escape-analysis-v2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/036e871fb77da20c503fb7438ca5ca6a?s=96&#38;d=identicon&#38;r=R" medium="image">
			<media:title type="html">ehren</media:title>
		</media:content>
	</item>
		<item>
		<title>Problems with const static initializations</title>
		<link>http://ehren.wordpress.com/2010/04/05/problems-with-const-static-initializations/</link>
		<comments>http://ehren.wordpress.com/2010/04/05/problems-with-const-static-initializations/#comments</comments>
		<pubDate>Mon, 05 Apr 2010 05:02:44 +0000</pubDate>
		<dc:creator>ehren</dc:creator>
				<category><![CDATA[Seneca]]></category>

		<guid isPermaLink="false">http://ehren.wordpress.com/?p=460</guid>
		<description><![CDATA[Unfortunately I spoke too soon about developing an airtight dead code finder. The technique of processing file scope variables I mentioned in my previous post has a serious drawback: it doesn&#8217;t work for const static data. This is a show stopper when it comes to peeking into most jump tables. I&#8217;ve been able to print [...]]]></description>
			<content:encoded><![CDATA[<p>Unfortunately I spoke too soon about developing an airtight dead code finder. The technique of processing file scope variables I mentioned in my <a href="http://ehren.wordpress.com/2010/04/04/function-declaration-escape-analysis/">previous post</a> has a serious drawback: it doesn&#8217;t work for const static data. This is a show stopper when it comes to peeking into most jump tables.</p>
<p>I&#8217;ve been able to print type information for all globals using the dehydra hooks placed into c-common.c; however, it seems like const initializations are not even handled at this level. I have my suspicions that there&#8217;s no way to recover the FUNCTION_DECL node in this case, likely because gcc has no use for the info at this level.</p>
<p>Although I may be able to make do by manually filtering as many callback functions as I can, this approach is not quite ideal. I&#8217;ll have to think more about this, but I&#8217;m now thinking that the <a href="http://gcc.gnu.org/wiki/LinkTimeOptimization">lto</a> streamer might be of use. There&#8217;s also the chance that there&#8217;s another way of using the cgraph to get at this data.</p>
<p>The other possibility is ditching gcc entirely and using elsa to dump the data. I&#8217;ll report back when I know more.</p>
]]></content:encoded>
			<wfw:commentRss>http://ehren.wordpress.com/2010/04/05/problems-with-const-static-initializations/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/036e871fb77da20c503fb7438ca5ca6a?s=96&#38;d=identicon&#38;r=R" medium="image">
			<media:title type="html">ehren</media:title>
		</media:content>
	</item>
		<item>
		<title>Function declaration escape analysis</title>
		<link>http://ehren.wordpress.com/2010/04/04/function-declaration-escape-analysis/</link>
		<comments>http://ehren.wordpress.com/2010/04/04/function-declaration-escape-analysis/#comments</comments>
		<pubDate>Sun, 04 Apr 2010 15:13:31 +0000</pubDate>
		<dc:creator>ehren</dc:creator>
				<category><![CDATA[Seneca]]></category>

		<guid isPermaLink="false">http://ehren.wordpress.com/?p=442</guid>
		<description><![CDATA[It&#8217;s been quite a while since I&#8217;ve blogged. Frankly I&#8217;ve got behind with my work. However, I believe I have hit a huge breakthrough finding dead code. I don&#8217;t want to get too ahead of myself but if things work as I think, I can now identify every unused function/member function in mozilla-central with a [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s been quite a while since I&#8217;ve blogged. Frankly I&#8217;ve got behind with my work. However, I believe I have hit a huge breakthrough finding dead code. I don&#8217;t want to get too ahead of myself but if things work as I think, I can now identify every unused function/member function in mozilla-central with a near 0 false positive rate.</p>
<p>I&#8217;ve mentioned many times that assignments to function pointers are a huge pain when it comes to recognizing call graph edges. I&#8217;ve been able to handle function local address taking for some time now but the problem of functions referenced in global variables (usually jump tables) has proved elusive. In fact, it&#8217;s currently not possible to process global variables at all with <a href="https://developer.mozilla.org/en/Treehydra">Treehydra</a>. This leads to thousands of false positives in the analysis and a bunch of special case handling.</p>
<p>It suddenly dawned on me that GCC might have already done the work for me. <a href="http://gcc.gnu.org/viewcvs/trunk/gcc/ipa-type-escape.c?revision=157950&amp;view=markup">ipa-type-escape.c</a>, which &#8220;determines which types in the program contain only instances that are completely encapsulated by the compilation unit&#8221; seems to fit the bill, but I ended up with less than stellar results trying to print out any escaping function declarations. In fact, I don&#8217;t think it really is useful for my purposes.</p>
<p>However, the technique of processing global variables using the <code>varpool</code> is exactly what I needed. In fact, I can very easily write a GCC plugin to print off all the globally escaping declarations in a compilation unit. Unfortunately, getting a plugin to build that uses more than the standard set of routines is a bit of a challenge (more stuff needs to be linked in) so I just hacked it into <a href="http://hg.mozilla.org/rewriting-and-analysis/dehydra/file/d039748e7775/dehydra_plugin.c#l301">dehydra_plugin.c</a>. It works though!</p>
<p>Here&#8217;s the code:</p>
<p><pre class="brush: cpp;">
static tree find_funcs_callback(tree *tp, int *walk_subtrees, void *data) {
  tree t = *tp;
  
  if (TREE_CODE(t) == FUNCTION_DECL) {
    // dump function (I use dehydra specific code)
  }
  
  return NULL_TREE;
}
 
static void find_funcs(tree decl) {
  walk_tree(&amp;decl, find_funcs_callback, NULL, NULL);
}

// This needs to go in the execute function of an IPA pass.
// I just stuck it into dehydra's gcc_plugin_post_parse
struct varpool_node *vnode;
FOR_EACH_STATIC_VARIABLE(vnode)
  find_funcs(DECL_INITIAL(vnode-&gt;decl));
</pre></p>
<p>Now all I have to do is mark every function printed by this routine as escaping. I&#8217;ve been able to match <a href="https://developer.mozilla.org/en/Callgraph">callgraph</a>&#8217;s serialization almost completely so this will be a breeze.</p>
<p>I&#8217;ve also found that the current way callgraph treats inheritance chains (using a <a href="https://developer.mozilla.org/En/Callgraph/Schema_Reference">table</a> of &#8216;implementor&#8217; &#8211; &#8216;interface&#8217; pairs) is not particularly useful for finding dead code. In fact, a whole bunch of functions in the implementors table are being left out of the node table. I&#8217;ve been able to rectify this by treating method overriding just like any call edge. In particular, I have the base class method &#8216;call&#8217; the derived class method, which fits right into my existing algorithm (because of dynamic dispatch, once the base method is called all bets are off on whether or not some derived method is called). In order to get any results previously, I&#8217;ve just been identifying base method-derived method pairs by textually matching the prototypes. Looking at my current results, I&#8217;ve found a disproportionate number of static methods, which suggests that this technique is too conservative. Here&#8217;s the <a href="http://matrix.senecac.on.ca/~egmetcalfe/path-restrictive.py">old script</a> btw. With these callgraph changes the next one will be much simpler (with no method name parsing!)</p>
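<p>The override-as-call-edge idea can be sketched as a plain reachability problem. This is a toy in standalone C with a fixed-size adjacency matrix, not the actual callgraph schema or script; add_call, add_override, mark_live and is_dead_node are hypothetical names, and functions are just small integer ids.</p>

```c
#include <assert.h>

#define MAX_NODES 16

/* edge[a][b] is nonzero when function a may call function b.  */
static unsigned char edge[MAX_NODES][MAX_NODES];
static unsigned char live[MAX_NODES];

/* Record an ordinary call edge.  */
void
add_call (int caller, int callee)
{
  assert (caller >= 0 && caller < MAX_NODES);
  assert (callee >= 0 && callee < MAX_NODES);
  edge[caller][callee] = 1;
}

/* Record an override as if the base method called the derived one:
   once the base method is reachable, dynamic dispatch may land in
   any override of it.  */
void
add_override (int base_method, int derived_method)
{
  add_call (base_method, derived_method);
}

/* Mark everything reachable from NODE as live (depth-first).  */
void
mark_live (int node)
{
  int i;

  assert (node >= 0 && node < MAX_NODES);
  if (live[node])
    return;
  live[node] = 1;
  for (i = 0; i < MAX_NODES; i++)
    if (edge[node][i])
      mark_live (i);
}

/* Return nonzero if NODE was never reached from any root.  */
int
is_dead_node (int node)
{
  assert (node >= 0 && node < MAX_NODES);
  return !live[node];
}
```

<p>Calling mark_live on every root (entry points and escaping functions) and then reporting each node for which is_dead_node is true gives the candidate dead list; a derived method stays live whenever the base method it overrides is reachable.</p>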
<p>I&#8217;ve had some success already with dead code in <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=551390">content</a> and particularly in <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=556446">layout</a>, even with this rudimentary script. I also certainly have <a href="http://zenit.senecac.on.ca/wiki/index.php/User:Egmetcalfe/Dead_Code">more to file</a>. With this new approach, though, I think I&#8217;ll be able to find and file <em>all of it</em> by release 1.0.</p>
]]></content:encoded>
			<wfw:commentRss>http://ehren.wordpress.com/2010/04/04/function-declaration-escape-analysis/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/036e871fb77da20c503fb7438ca5ca6a?s=96&#38;d=identicon&#38;r=R" medium="image">
			<media:title type="html">ehren</media:title>
		</media:content>
	</item>
		<item>
		<title>Dead code update (and problems finding uninitialized class members)</title>
		<link>http://ehren.wordpress.com/2010/02/25/dead-code-update-and-problems-finding-uninitialized-class-members/</link>
		<comments>http://ehren.wordpress.com/2010/02/25/dead-code-update-and-problems-finding-uninitialized-class-members/#comments</comments>
		<pubDate>Thu, 25 Feb 2010 14:16:10 +0000</pubDate>
		<dc:creator>ehren</dc:creator>
				<category><![CDATA[Seneca]]></category>

		<guid isPermaLink="false">http://ehren.wordpress.com/?p=433</guid>
		<description><![CDATA[Well, I&#8217;ve fallen way behind with this blog again. Unfortunately, what I&#8217;ve done so far hasn&#8217;t worked out exactly as expected either. Anyway, here&#8217;s an update. Finding Dead Code I&#8217;ve refined the script I posted previously to take into account the fact that indirect calls to virtual functions register in g++ as calls to the [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ehren.wordpress.com&amp;blog=9426727&amp;post=433&amp;subd=ehren&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Well, I&#8217;ve fallen way behind with this blog again. Unfortunately, what I&#8217;ve done so far hasn&#8217;t worked out exactly as expected either. Anyway, here&#8217;s an update.</p>
<p><strong>Finding Dead Code</strong><br />
I&#8217;ve refined the script I posted previously to take into account the fact that indirect calls to virtual functions register in g++ as calls to the base method (among a few other things). It can be viewed <a href="http://matrix.senecac.on.ca/~egmetcalfe/path.py">here</a>. Running this new analysis, I&#8217;ve learned that finding dead methods is even more difficult than finding dead functions. Why? A tonne of stuff is exported as part of the Mozilla public API. A function or even an entire class may seem dead but it&#8217;s likely used via JavaScript/XPCOM inside the Mozilla tree (or maybe in an extension).</p>
<p>As far as I know, anything that derives from a class defined in an IDL file probably can&#8217;t be marked dead, and I suspect detecting this case will be the most difficult part. One thing I could take into account, however, is methods marked as <code>NS_SCRIPTABLE</code>, i.e. <code>__attribute__((user("NS_script")))</code>. It would be easy to add this info to the node table in <a href="https://developer.mozilla.org/en/Callgraph">callgraph</a>. Also, in regard to plain functions, I think it would be prudent to register function local address taking (of functions).</p>
<p>On the bright side I think I&#8217;ve found a handful of unused functions (view sample output of the script showing these <a href="http://matrix.senecac.on.ca/~egmetcalfe/maybe-dead-stuff.txt">here</a>). I may be able to glean more info once I&#8217;m able to actually query my paths table though. Unfortunately, the latest incarnation of the script is only about 85% finished:<br />
<pre class="brush: plain;">
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                         
28977 ehren     20   0 17804  12m 2132 R 100.1  0.4   2331:07 python
</pre></p>
<p>That&#8217;s CPU time!</p>
<p><strong>Uninitialized class members</strong><br />
I&#8217;ve also put a bit of work into a more immediate analysis. <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=525063">Bug 525063</a> proposes an error for any class with uninitialized data members. It seemed like an interesting project perhaps as an introduction to ESP analysis (especially since the only state transitions are assignments and passing by reference). In fact, dmandelin&#8217;s already done work on an analysis for uninitialized variables in general which I originally attempted to adapt. This was not the easiest task however, although I have just recently learned why I was not successful (I think it relates to a malfunctioning of <a href="http://hg.mozilla.org/rewriting-and-analysis/dehydra/file/d039748e7775/libs/unstable/analysis.js#l14">create_decl_set</a> which I have yet to diagnose).</p>
<p>Giving up on ESP for the moment, I tried my hand doing things the simple way. That is, if there&#8217;s any code at all in the constructor that initializes a variable, then I consider that variable initialized. </p>
<p>To accomplish this, I first wrote a <a href="http://matrix.senecac.on.ca/~egmetcalfe/uninit-old.js">rather ugly</a> and unreadable version of the script. I then <a href="http://matrix.senecac.on.ca/~egmetcalfe/uninit.js">polished it up</a> a bit. My days of using JS arrays for maps and sets are hopefully over and, in fact, I think I&#8217;d be able to take another run at an ESP analysis now. Unfortunately though, running my simple script has uncovered more fundamental problems.</p>
<p>The bug comments mention classes with an overridden <code>operator new</code> (containing a <code>memset(0)</code> to initialize the data members) as one special case but this is nowhere near the largest problem. Consider <a href="https://developer.mozilla.org/en/Adding_XPCOM_components_to_Mozilla_build_system#The_init_function">this mdc page on xpcom</a>:</p>
<blockquote><p>You can provide an initialization function for your class. This will be called immediately after your class is allocated and the constructor is called. The init function takes 0 arguments, returns an nsresult, and must be public. You can call it anything you like, just reference it from NS_GENERIC_FACTORY_CONSTRUCTOR_INIT (discussed below).
</p></blockquote>
<p>Now we have the situation of variable initialization split across multiple functions. In fact, it&#8217;s even more complicated! Here&#8217;s an example:</p>
<p><a href="http://hg.mozilla.org/mozilla-central/file/9a4b73f92f0e/editor/libeditor/html/nsHTMLEditor.h#l750">mCSSAware</a> (part of nsHTMLEditor) is one &#8216;uninitialized&#8217; member turned up by the script. Rather than being directly initialized in <code>Init()</code> (or in the constructor), the initialization takes place in <a href="http://hg.mozilla.org/mozilla-central/file/9a4b73f92f0e/editor/libeditor/html/nsHTMLEditor.h#l442">UpdateForFlags</a> which is called from <code>Init()</code> <a href="http://hg.mozilla.org/mozilla-central/file/9a4b73f92f0e/editor/libeditor/html/nsHTMLEditor.cpp#l276">here</a>. Luckily in this case the initializations are all within the same translation unit although I suspect this is not always the case.</p>
<p>So, what would be the solution? If it&#8217;s worth doing, some kind of interprocedural analysis will have to be performed. I was thinking of collecting all the &#8216;uninitialized&#8217; fields during compilation of both the constructor and any other function identified as an initializer. Initializers must include any member function called by an initializer (note that pass by reference is just simple initialization). If the set of uninitialized fields in a particular constructor intersected with the sets of fields from the other initializers is non-empty, you&#8217;ve found a genuine uninitialized field. </p>
<p>I was thinking that if every initialization takes place in the same translation unit it might be possible to do this within the current static-checking framework ie by compiling only once. I&#8217;d basically just have to identify the set of data members not initialized for <em>every</em> member function while building a mini call graph to eventually determine which of those member functions is really an initializer. It would then be a simple matter to take the intersection of the relevant sets of uninitialized fields. </p>
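<p>As a rough illustration of that intersection step, here is a minimal Python sketch. It assumes the per-function sets of unassigned members and the member call graph have already been collected somehow; the function name, the data layout and the members <code>mFlags</code> and <code>mCache</code> are all made up for the example, and a real version would be a Treehydra script rather than Python.</p>

```python
def find_uninitialized(roots, unassigned, calls):
    """Return the members that no initializer assigns.

    roots      -- known initializers (the constructor and any init function)
    unassigned -- maps each member function to the data members it never assigns
    calls      -- maps each member function to the member functions it calls
    """
    # Initializers are the roots plus everything transitively called from them.
    initializers = set()
    worklist = list(roots)
    while worklist:
        function = worklist.pop()
        if function in initializers:
            continue
        initializers.add(function)
        worklist.extend(calls.get(function, ()))

    # A member is only a genuine problem if every initializer leaves it unassigned.
    return set.intersection(*[set(unassigned[f]) for f in initializers])


# Hypothetical data loosely modelled on the nsHTMLEditor example.
unassigned = {
    "nsHTMLEditor": {"mCSSAware", "mCache"},
    "Init": {"mFlags", "mCSSAware", "mCache"},
    "UpdateForFlags": {"mFlags", "mCache"},
}
calls = {"Init": ["UpdateForFlags"]}
result = find_uninitialized(["nsHTMLEditor", "Init"], unassigned, calls)
print(sorted(result))
```

<p>With this made-up data only <code>mCache</code> survives the intersection, since <code>mCSSAware</code> is assigned in <code>UpdateForFlags</code>, which is reached through <code>Init</code>.</p>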
<p>However, I suspect there&#8217;s probably more than a few cases where member function definitions are spread across translation units (Actually that might be a neat dehydra analysis if this behaviour&#8217;s unwanted). To deal with this case I&#8217;d have to store the sets of &#8216;uninitialized&#8217; data members until the entire build is complete while also determining the set of initializing functions. I&#8217;d then have to determine which data members are really initialized during a post-compilation analysis.</p>
<p>It might also be possible to produce the errors while the build process is still under way if I was able to query an sqlite database from within a treehydra script. I&#8217;m under the impression that the only way to use sqlite from js is by using xpcshell though so I don&#8217;t know if this suggestion would work. You&#8217;d still have to run a callgraph build prior to compilation anyway which takes quite some time. </p>
<p>There&#8217;s also the issue of the accuracy of any analysis when the call graph will likely have to be pretty dumb, i.e. infeasible code in a particular function that contains a call will end up registering as an edge. I suppose it would also be necessary to relate a function&#8217;s parameters (and any global data) to whether or not a particular call within a function is feasible (and so on). The ESP paper does discuss a framework for interprocedural analysis but that&#8217;s way out of my scope. It would probably still be a worthwhile analysis if this kind of stuff is not taken into account; however, it kind of defeats the purpose of performing an ESP analysis to find which members are uninitialized.</p>
<p>Anyway, I&#8217;ll report back (hopefully without such a long delay) when I&#8217;ve determined the next course of action.</p>
]]></content:encoded>
			<wfw:commentRss>http://ehren.wordpress.com/2010/02/25/dead-code-update-and-problems-finding-uninitialized-class-members/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/036e871fb77da20c503fb7438ca5ca6a?s=96&#38;d=identicon&#38;r=R" medium="image">
			<media:title type="html">ehren</media:title>
		</media:content>
	</item>
		<item>
		<title>A static analysis for fallthrough switch cases in C++</title>
		<link>http://ehren.wordpress.com/2010/02/08/a-static-analysis-for-fallthrough-switch-cases-in-c/</link>
		<comments>http://ehren.wordpress.com/2010/02/08/a-static-analysis-for-fallthrough-switch-cases-in-c/#comments</comments>
		<pubDate>Mon, 08 Feb 2010 07:15:46 +0000</pubDate>
		<dc:creator>ehren</dc:creator>
				<category><![CDATA[Seneca]]></category>

		<guid isPermaLink="false">http://ehren.wordpress.com/?p=418</guid>
		<description><![CDATA[For my 0.5 release in DPS911 I&#8217;d like to discuss some work on a Treehydra script for detecting fallthrough cases in a switch statement. I&#8217;ve actually been working on this off and on for some time, but it&#8217;s only been within the last few hours that I&#8217;ve arrived at something airtight. Actually, I began work [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ehren.wordpress.com&amp;blog=9426727&amp;post=418&amp;subd=ehren&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>For my 0.5 release in <a href="http://zenit.senecac.on.ca/wiki/index.php/DPS911">DPS911</a> I&#8217;d like to discuss some work on a <a href="https://developer.mozilla.org/en/Treehydra">Treehydra</a> script for detecting fallthrough cases in a switch statement. I&#8217;ve actually been working on this off and on for some time, but it&#8217;s only been within the last few hours that I&#8217;ve arrived at something airtight. </p>
<p>Actually, I began work on such a script quite a while ago, after completing similar work on <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=535646">finding unreachable blocks in the control flow graph</a>. Most of the serious bugs turned up by that analysis were caused by a forgotten break statement (see <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=536438">bug 536438</a> and <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=536646">bug 536646</a>) so a warning/error for this seemed natural. Also, I assumed this would be easy: just find all the basic blocks corresponding to the case labels of the switch and, if the successor of a block corresponds to the next case, you&#8217;ve got a fallthrough.</p>
<p>To make this more clear, here&#8217;s a simple switch statement that falls through:</p>
<p><pre class="brush: plain;">
int foo(int x) {
  switch(x) {
    case 0:
      x++;
    case 1:
      x--;
  }
}
</pre></p>
<p>And here&#8217;s a <a href="http://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html">-fdump-tree-all</a> printing of the cfg:</p>
<p><pre class="brush: plain;">
;; Function int foo(int) (_Z3fooi)

int foo(int) (x)
{
  # BLOCK 2
  # PRED: ENTRY (fallthru)
  switch (x)
    {
      case 0: goto &lt;L0&gt;;
      case 1: goto &lt;L1&gt;;
      default : goto &lt;L2&gt;;
    }
  # SUCC: 3 4 5

  # BLOCK 3
  # PRED: 2
&lt;L0&gt;:;
  x = x + 1;
  # SUCC: 4 (fallthru)

  # BLOCK 4
  # PRED: 2 3 (fallthru)
&lt;L1&gt;:;
  x = x + -1;
  # SUCC: 5 (fallthru)

  # BLOCK 5
  # PRED: 2 4 (fallthru)
&lt;L2&gt;:;
  return;
  # SUCC: EXIT
}

</pre></p>
<p>Notice something funny? There&#8217;s now a default case label! GCC will always insert this dummy default by the time you&#8217;ve reached the <a href="http://gcc.gnu.org/onlinedocs/gccint/Control-Flow.html#Control-Flow">cfg</a> and, even worse, there&#8217;s no way to distinguish a switch statement with a real default from one without, once you&#8217;ve reached this level. Unfortunately, I spent a great deal of time trying to do just that, at various times thinking I finally had the solution only to discover a new test case that would totally blow apart the analysis. I think I&#8217;ve tried maybe a dozen different approaches only to give up in disgust at various points. See <a href="http://matrix.senecac.on.ca/~egmetcalfe/switch-OLD.js">here</a> for one totally invalid early attempt.</p>
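<p>To see why the dummy default hurts, here is a small Python sketch of the naive check described earlier, run against a hand-copied model of the cfg dump above. The dictionary layout and the function name are just for illustration; the real analysis works on Treehydra&#8217;s cfg objects, not on anything like this.</p>

```python
# CFG of foo() copied from the dump above: block number -> successor blocks.
successors = {2: [3, 4, 5], 3: [4], 4: [5], 5: []}

# Case labels in source order, each with the block it jumps to.
# The "default" entry was added by GCC; foo() has no default in the source.
case_blocks = [("case 0", 3), ("case 1", 4), ("default", 5)]


def naive_fallthroughs(successors, case_blocks):
    """Report each case whose block has the next case's block as a successor."""
    found = []
    for (label, block), (next_label, next_block) in zip(case_blocks, case_blocks[1:]):
        if next_block in successors[block]:
            found.append((label, next_label))
    return found


report = naive_fallthroughs(successors, case_blocks)
print(report)  # [('case 0', 'case 1'), ('case 1', 'default')]
```

<p>The first pair is the real fallthrough. The second is spurious: case 1 simply reaches the end of the switch, but at this level that looks exactly like falling into a default.</p>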
<p>So what&#8217;s the solution? I needed information from two separate passes of the compiler: The cfg, but also the initial <a href="http://gcc.gnu.org/onlinedocs/gccint/Trees.html#Trees">c++ ast</a> representation to find out if the default was <em>really</em> a default. To this end, I knew <a href="http://hg.mozilla.org/rewriting-and-analysis/dehydra/file/d039748e7775/libs/treehydra.js#l110">walk_tree</a> would be needed. I was even able to find a <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=542276">bug</a> in Treehydra during some initial experiments. However, as before, once I thought I had the solution, a more complex test case would ruin everything.</p>
<p>Anyway, <a href="http://matrix.senecac.on.ca/~egmetcalfe/switch.js">here&#8217;s the finalish script</a> which also employs a <code>fallthrough()</code> function/annotation to suppress the warnings. Once I&#8217;ve had a bit of time to sleep on it, I&#8217;m going to refactor a bit and then post to the <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=522774">bug</a>. I also need to consider how to integrate this into a <a href="https://developer.mozilla.org/en/Building_with_static_checking">--with-static-checking</a> build. There are a number of issues I had to take into account that I haven&#8217;t mentioned, as well, such as when the order of the cases is mixed up (eg case 1 before case 0) and also when labels and gotos come into the mix. (Properly dealing with switches embedded within switches &#8230; was the last hurdle.) </p>
<p>For a demonstration, <a href="http://matrix.senecac.on.ca/~egmetcalfe/switch.js.example.txt">here&#8217;s</a> a nasty bit of code together with a sample run of the script.</p>
<p>So, for my 0.6 release, I&#8217;ll have to get back to dead code, hopefully being able to exploit <a href="https://developer.mozilla.org/en/Callgraph">callgraph</a> to the fullest. Perhaps I can find another analysis to start work on as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://ehren.wordpress.com/2010/02/08/a-static-analysis-for-fallthrough-switch-cases-in-c/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/036e871fb77da20c503fb7438ca5ca6a?s=96&#38;d=identicon&#38;r=R" medium="image">
			<media:title type="html">ehren</media:title>
		</media:content>
	</item>
		<item>
		<title>Computing connected components with SQLite (and other attempts finding dead code)</title>
		<link>http://ehren.wordpress.com/2010/02/04/computing-connected-components-with-sqlite-and-other-attempts-finding-dead-code/</link>
		<comments>http://ehren.wordpress.com/2010/02/04/computing-connected-components-with-sqlite-and-other-attempts-finding-dead-code/#comments</comments>
		<pubDate>Thu, 04 Feb 2010 05:39:37 +0000</pubDate>
		<dc:creator>ehren</dc:creator>
				<category><![CDATA[Seneca]]></category>

		<guid isPermaLink="false">http://ehren.wordpress.com/?p=409</guid>
		<description><![CDATA[In the last couple of weeks I&#8217;ve put a bit of effort into identifying unused functions in mozilla-central. Unfortunately, I have not been keeping up with this blog, which is a major requirement in DPS911. As I previously reported, I&#8217;ve been using a Treehydra generated call graph (callgraph) which records caller-callee relationships in a giant [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ehren.wordpress.com&amp;blog=9426727&amp;post=409&amp;subd=ehren&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>In the last couple of weeks I&#8217;ve put a bit of effort into identifying unused functions in mozilla-central. Unfortunately, I have not been keeping up with this blog, which is a major requirement in DPS911. </p>
<p>As I previously reported, I&#8217;ve been using a Treehydra generated call graph (<a href="https://developer.mozilla.org/en/Callgraph">callgraph</a>) which records caller-callee relationships in a giant sqlite3 database. As mentioned, my first attempt was to ignore the directedness of each edge in this graph, and then compute the connected components. I believe this is equivalent to computing the weakly connected components in a directed graph. </p>
<p>I have based my code almost entirely on work done by David MacIver, who has an interesting post <a href="http://www.trampolinesystems.com/blog/machines/2008/11/19/computing-connected-graph-components-via-sql/">here</a> where you can read more about his algorithm. It works by initially placing each vertex into its own component and then iteratively merging distinct components where there is an edge between two vertices in the respective components. Unfortunately, his Ruby written for MySQL is not directly usable for my purposes since SQLite, which I&#8217;m working with, does not support the update join syntax. Here&#8217;s a bit of MacIver&#8217;s code for example:</p>
<p><pre class="brush: sql;">
    update items
    join (
      select component1 source, min(component2) target
      from components_to_merge
      group by source
    ) new_components
    on new_components.source = component_id
    set items.component_id = least(items.component_id, target)
</pre></p>
<p>To create the equivalent code, I used a temporary table and a bit of kludge:</p>
<p><pre class="brush: sql;">
    INSERT INTO components_to_merge
    SELECT component2, component1 FROM components_to_merge;

    DROP TABLE IF EXISTS new_components;
    CREATE TABLE new_components
    AS 
    SELECT component1 source, min(component2) target
    FROM components_to_merge
    GROUP BY source;

    UPDATE node
    SET componentID = min((SELECT min(componentID, target)
                           FROM new_components
                           WHERE source = componentID),
                          componentID)
    WHERE componentID in
        (SELECT source FROM new_components);
</pre></p>
<p>That update statement is horribly inefficient but ultimately the job gets done. All told, it takes about an hour on one of the CDOT&#8217;s development machines. As MacIver noted in his original blog post,  you&#8217;re basically done once the first merge has completed. The complete script, written in python, can be viewed <a href="http://matrix.senecac.on.ca/~egmetcalfe/alg.py">here</a>.</p>
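<p>For anyone who wants to play with the idea on a toy graph, here is a self-contained Python sketch using an in-memory SQLite database. It is a simplified label propagation (each node repeatedly takes the smallest label among itself and its neighbours), not MacIver&#8217;s merge-table algorithm and not my actual script. The <code>edge</code> table and its <code>caller</code>/<code>callee</code> columns are assumptions made for the example rather than the real callgraph schema.</p>

```python
import sqlite3


def connected_components(vertices, edges):
    """Map each vertex id to the smallest id in its component, ignoring direction."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE node (id INTEGER PRIMARY KEY, componentID INTEGER)")
    cur.execute("CREATE TABLE edge (caller INTEGER, callee INTEGER)")
    # Every vertex starts out in its own component.
    cur.executemany("INSERT INTO node VALUES (?, ?)", [(v, v) for v in vertices])
    # Store each edge in both directions so that direction is ignored.
    cur.executemany("INSERT INTO edge VALUES (?, ?)", edges)
    cur.executemany("INSERT INTO edge VALUES (?, ?)", [(b, a) for a, b in edges])

    previous = None
    while True:
        # Pull each label down to the smallest label among the node and its
        # neighbours; coalesce() covers nodes that have no edges at all.
        cur.execute("""
            UPDATE node
            SET componentID = min(componentID,
                                  coalesce((SELECT min(n2.componentID)
                                            FROM edge e
                                            JOIN node n2 ON n2.id = e.callee
                                            WHERE e.caller = node.id),
                                           componentID))
        """)
        current = cur.execute(
            "SELECT id, componentID FROM node ORDER BY id").fetchall()
        if current == previous:
            break  # a full pass changed nothing, so the labels are final
        previous = current
    conn.close()
    return dict(current)


components = connected_components([1, 2, 3, 4, 5, 6], [(1, 2), (3, 2), (5, 4)])
print(components)
```

<p>Here vertices 1, 2 and 3 end up sharing label 1, vertices 4 and 5 share label 4, and the isolated vertex 6 keeps its own label.</p>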
<p>Note that a few domain specific refinements had to take place here. For example, any indirect calls to virtual functions will be registered as a call to the function in the base class. Callgraph already deals with this using an implementors table but I had to add an extra column for the return type of the function in order to accurately construct the name of both member functions. I then merge their components:</p>
<p><pre class="brush: python;">
# set interface component to implementor component
cursor.execute('SELECT implementor, interface, method, type \
                FROM implementors')
rows = cursor.fetchall()  # avoid shadowing the built-in list
for implementor, interface, method, rtype in rows:
    # format: type interface::method
    basename = &quot;%s %s::%s&quot; % (rtype, interface, method)
    # format: type implementor::method
    derivedname = &quot;%s %s::%s&quot; % (rtype, implementor, method)
    # bound parameters keep any quotes in a name from breaking the SQL
    cursor.execute(&quot;UPDATE node \
                    SET componentID = (SELECT componentID \
                                       FROM node \
                                       WHERE name = ?) \
                    WHERE name = ?&quot;, (basename, derivedname))
</pre>  </p>
<p>Also, whenever a function is called outside of the compilation unit in which it&#8217;s defined, the call will point to the location of the forward declaration of the function (eg the .h file). To get around this, I simply merge all components that contain nodes with the same name:</p>
<p><pre class="brush: sql;">
SELECT n.name, min(nn.componentID)
FROM node n, node nn
WHERE n.name = nn.name AND
n.componentID != nn.componentID
GROUP by n.componentID
</pre></p>
<p>Of course, some nuance is required to interpret the data produced by this script. As I mentioned previously, the analysis is complicated by the fact that many functions &#8220;never called&#8221; are in fact called indirectly via a function pointer. This means that most connected components of size one aren&#8217;t particularly interesting. In fact, since I&#8217;m merging declarations and forward declarations, most components of size two also aren&#8217;t particularly interesting (at least those with elements containing the same name). <a href="http://matrix.senecac.on.ca/~egmetcalfe/verify.py">This script</a> takes the above into account and can be used to produce these results:</p>
<p><a href="http://matrix.senecac.on.ca/~egmetcalfe/verify.txt">A list</a> of connected components (that aren&#8217;t the largest component) with size greater than 3<br />
and<br />
<a href="http://matrix.senecac.on.ca/~egmetcalfe/interesting-couples.txt">A list</a> of components of size 2 (that aren&#8217;t the largest component) that have members with distinct names.</p>
<p>The vast majority of these are false positives, mostly for the reasons mentioned above, but in the last week I finally bit the bullet and <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=543255">filed a bug</a>, along with a patch, to remove a bit of the dead code identified. Unfortunately, these functions are all in the cairo library, which means that applying such a patch would make tracking upstream changes unnecessarily difficult.</p>
<p>A number of issues have been identified here though. Previously, in connection with an analysis to find non-static functions called only within a particular compilation unit, Taras filed <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=537857">this bug</a> to add <code>-fdata-sections -ffunction-sections -Wl,-gc-sections</code> to the build config in order to strip out dead symbols. Interestingly enough, running a few of my own tests with this configuration, I was not able to strip out all of these dead cairo functions. Even more bizarrely, the symbols of a number of dead static (!) functions seemed to work themselves into libxul?!</p>
<p>As an aside, the <a href="http://www.opengroup.org/onlinepubs/9699919799/utilities/nm.html">nm</a> utility would have been quite useful when working on the alwayszero project. (To get similar results I was previously running debug builds and then grepping the output of objdump).</p>
<p><strong>A path based algorithm</strong><br />
There are issues with the above analysis though. For example, a function could be unused yet call functions that are used. <a href="http://matrix.senecac.on.ca/~egmetcalfe/path.py">This script</a> begins at a particular node and iterates backwards, transitively adding all callers. If the callers to be added are exhausted within a reasonable number of iterations, we&#8217;ve found a dead path. Note that to narrow down this analysis, I&#8217;ve disregarded everything already identified by the component based analysis ie I do not start with any node not in the largest connected component. </p>
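<p>Stripped of the database, the backward walk looks roughly like the Python sketch below. The <code>callers</code> dictionary, the function names and the iteration limit are invented for the example; the real script works against the callgraph tables. Note that a genuine entry point also has no callers, so a sketch like this would flag it too unless known entry points are excluded first.</p>

```python
def backward_closure(callers, start, max_iterations):
    """Collect start plus its transitive callers.

    callers maps a function name to the set of functions that call it.
    Returns the collected set if the callers run out within max_iterations,
    otherwise None.
    """
    path = {start}
    frontier = {start}
    for _ in range(max_iterations):
        # Callers of the current frontier that have not been seen yet.
        new = set()
        for function in frontier:
            new.update(callers.get(function, ()))
        new -= path
        if not new:
            return path  # callers exhausted: candidate dead path
        path |= new
        frontier = new
    return None  # not exhausted within the limit


# Hypothetical call graph: helper is only called by old_api, which nobody calls.
callers = {
    "helper": {"old_api"},
    "util": {"main_loop", "old_api"},
    "main_loop": {"a"},
    "a": {"b"},
    "b": {"c"},
}
dead = backward_closure(callers, "helper", 3)
live = backward_closure(callers, "util", 3)
print(sorted(dead), live)
```

<p>In this toy graph the walk from <code>helper</code> stops after picking up <code>old_api</code>, while the walk from <code>util</code> is still finding new callers when the limit is hit.</p>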
<p>One thing I should note is that this script requires explicit transaction management to run (using <code>cursor.execute("BEGIN TRANSACTION")</code> and <code>connection.commit()</code>). For anyone faced with mysterious insert/update anomalies in sqlite, this may be the answer.    </p>
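<p>In case it helps, this is the general shape of the explicit transaction handling with Python&#8217;s <code>sqlite3</code> module. It is only a minimal sketch against an in-memory database; the <code>path</code> table here is a cut-down stand-in with made-up rows, not the table my script builds.</p>

```python
import sqlite3

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE path (pathID INTEGER, id INTEGER)")

# Open the transaction by hand, batch the writes, then commit them together.
cursor.execute("BEGIN TRANSACTION")
cursor.executemany("INSERT INTO path VALUES (?, ?)",
                   [(1, 10), (1, 11), (2, 12)])
connection.commit()

# Work that should not be kept can be undone the same way.
cursor.execute("BEGIN TRANSACTION")
cursor.execute("INSERT INTO path VALUES (?, ?)", (3, 13))
connection.rollback()

count = cursor.execute("SELECT count(*) FROM path").fetchone()[0]
print(count)  # 3
```

<p>Only the three committed rows remain; the insert made inside the second transaction is discarded by the rollback.</p>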
<p>Anyway, getting useful data out of this is a little bit more complicated. For example, to see info about every function in all paths that contain only functions, the following query can be used:</p>
<p><pre class="brush: sql;">
SELECT * FROM node n
JOIN path p ON (n.id = p.id)
WHERE pathID NOT IN 
(SELECT pathID FROM node n
 JOIN path p ON (p.id = n.id)
 WHERE isMethod = 0) 
ORDER BY pathID;
</pre></p>
<p>Has this analysis yielded any more candidates for removal though? Unfortunately, I have not taken indirect calls to overridden member functions into account, eg <code>Base* b = new Derived(); b-&gt;foo();</code>, so an analysis for dead methods is currently out. There is an easy fix; however, it will require another 18 hr run of the script (building a path backward from a start set of 100000+ nodes = a boatload of time). Analyzing just functions though, i.e. using the query above, I have found some interesting possibilities. For example, there&#8217;s quite a bit of dead stuff in <a href="http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/src/nameprep.c">nameprep.c</a>, but I suspect this is an even bigger nightmare to patch than cairo.</p>
<p>I will report back soon, however, when I have more concrete results.</p>
]]></content:encoded>
			<wfw:commentRss>http://ehren.wordpress.com/2010/02/04/computing-connected-components-with-sqlite-and-other-attempts-finding-dead-code/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/036e871fb77da20c503fb7438ca5ca6a?s=96&#38;d=identicon&#38;r=R" medium="image">
			<media:title type="html">ehren</media:title>
		</media:content>
	</item>
		<item>
		<title>Project plan for DPS911 &#8211; Open Source Project</title>
		<link>http://ehren.wordpress.com/2010/01/15/project-plan-for-dps911-open-source-project/</link>
		<comments>http://ehren.wordpress.com/2010/01/15/project-plan-for-dps911-open-source-project/#comments</comments>
		<pubDate>Sat, 16 Jan 2010 00:43:27 +0000</pubDate>
		<dc:creator>ehren</dc:creator>
				<category><![CDATA[Seneca]]></category>

		<guid isPermaLink="false">http://ehren.wordpress.com/?p=382</guid>
		<description><![CDATA[It&#8217;s a new semester and I&#8217;m in a new class focused on open source development. DPS911 is the continuation of DPS909 and requires students to make 7 releases over the 14 week term. I&#8217;ll mostly be focused on static analysis work related to Mozilla. Wrapping up my old project I likely will not be putting [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ehren.wordpress.com&amp;blog=9426727&amp;post=382&amp;subd=ehren&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s a new semester and I&#8217;m in a new class focused on open source development. <a href="http://zenit.senecac.on.ca/wiki/index.php/DPS911">DPS911</a> is the continuation of <a href="http://zenit.senecac.on.ca/wiki/index.php/DPS909">DPS909</a> and requires students to make 7 releases over the 14 week term. I&#8217;ll mostly be focused on static analysis work related to Mozilla.</p>
<p><strong>Wrapping up my old project</strong><br />
I likely will not be putting any more work (this semester) into my <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=517370">treehydra analysis + gcc plugin</a> for virtual functions that always return zero. However, I should mention a few things that weren&#8217;t quite resolved as of my <a href="http://ehren.wordpress.com/2009/12/15/mystery-bytes-3/">last post</a>. After reducing the number of instances where the plugin is invoked to optimize value-returning function calls, I was able to conclude that no significant size reduction is made in any of Mozilla&#8217;s shared libraries. By identifying the functions which are optimized, however, I was able to check if they are present in, e.g., libxul at all (this was accomplished by running a debug build and then grepping the <a href="http://en.wikipedia.org/wiki/Objdump">objdumps</a>). They are, in fact, not present, and I suspect that if I put more effort into patching all of the functions identified and/or using <a href="https://developer.mozilla.org/en/Callgraph">callgraph</a> to find functions whose return value depends solely on other functions which always return zero, I might have better results.</p>
<p><strong>A new project</strong><br />
At least initially, I will be putting my effort into finding and <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=457262">removing dead code</a>. I had a bit of a warm up for this over the break by working on an analysis to find functions that are called only within the compilation unit of their definition but aren&#8217;t static (<a href="https://bugzilla.mozilla.org/show_bug.cgi?id=536427">bug 536427</a>). This bug illustrates some of the problems I will encounter with dead code elimination, namely that it&#8217;s not always clear whether a function is part of a public API (and thus not a candidate for removal or demarcation as static). This may be my biggest obstacle since I&#8217;m pretty sure the only solution is a manual examination/understanding of the code. As an aside, I attempted to read a little into linkage issues that might have bearing on whether or not a symbol is exported as part of a public API. For example, I probably shouldn&#8217;t be looking at functions marked with e.g. <a href="http://msdn.microsoft.com/en-us/library/3y1sfaz2.aspx">dllexport</a> (Windows specific). In this regard, I was curious what marking a declaration as <code>extern</code> would imply. Interestingly, the author of <a href="http://en.wikipedia.org/wiki/Gold_%28linker%29">GNU gold</a> asserts that the extern keyword as applied to function declarations is <a href="http://gcc.gnu.org/ml/gcc/2009-04/msg00812.html">entirely superfluous</a>.</p>
<p>The other stumbling block is function pointers, or rather the fact that many functions are only called via a pointer. This makes it nearly impossible to get any useful data from simple call graph queries like &#8220;which functions are never called&#8221;. I&#8217;m pretty sure it would be easy to extend callgraph to note whenever a function address is assigned to a pointer or used as a call argument, so long as this happens within a function body. The problem is that this frequently happens outside of a function body, e.g. there are tonnes of globally defined and initialized jump tables (particularly in the C code).</p>
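<p>To make the jump table problem concrete, here is a minimal sketch. The handler names, the table and <code>dispatch</code> are all hypothetical and not taken from the Mozilla tree. Neither handler is ever the target of a direct call, and their addresses are taken in a global initializer rather than inside a function body:</p>

```cpp
#include <cstddef>

// Hypothetical handlers: neither one is ever named in a direct call.
static int handle_open(int arg)  { return arg + 1; }
static int handle_close(int arg) { return arg - 1; }

// A globally defined and initialized jump table. The function addresses
// escape here, outside of any function body.
typedef int (*handler_fn)(int);
static const handler_fn kHandlers[] = { handle_open, handle_close };

// The only call site is indirect, so a call graph built purely from
// direct calls has no edge from dispatch to either handler.
int dispatch(std::size_t opcode, int arg) {
  return kHandlers[opcode](arg);
}
```

<p>A &#8220;which functions are never called&#8221; query over direct call edges would report both handlers as dead, even though <code>dispatch(0, 10)</code> clearly runs <code>handle_open</code>.</p>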
<p>Ultimately, I think manually identifying functions whose address escapes in a pointer will be the best approach for my purposes. However, <a href="http://blog.mozilla.com/tglek/">Taras</a> has mentioned that getting more of this kind of data (like which functions are dead or improperly not static) into <a href="https://wiki.mozilla.org/IT/Production_Acceptance/DXR">DXR</a> would be helpful (and so cutting down on the false positives mechanically might not be the worst idea).</p>
<p><strong>Release Plan</strong><br />
I haven&#8217;t yet given too much detail about where I&#8217;m going with dead code but I expect that work on this should cover at least from 0.4 to 0.6. After this I may start work on what seems like a complicated bug involving <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=531406">symbol rearrangement to improve Firefox startup time</a>. I may also try my hand at more static analysis scripts for detecting coding errors if time permits. Over the break, I did some work on an analysis pass for finding unreachable blocks in the control flow graph (<a href="https://bugzilla.mozilla.org/show_bug.cgi?id=535646">bug 535646</a>) and it was quite fun finding errors in obscure places (in fact, I have more to file). I&#8217;m also curious about what other useful analyses can be carried out without tracking any values.</p>
<p>Anyway, here&#8217;s a tentative schedule for the next few releases:</p>
<p><strong>0.4</strong> &#8211; Finalize my current graph-based dead code detection algorithm and hopefully get a few dead functions filed. I&#8217;ll explain more about this in a coming post, but my current analysis treats the <a href="http://en.wikipedia.org/wiki/Call_graph">call graph</a> as an undirected graph and then computes the <a href="http://en.wikipedia.org/wiki/Connected_component_%28graph_theory%29">connected components</a> to find the dead stuff. This likely ignores a wide swath of potentially dead functions, i.e. those which call live functions but are not called by any live functions.</p>
<p><strong>0.5/0.6</strong> &#8211; Consider directedness in the analysis. Upon reflection, this is more complicated than I had previously considered. </p>
<p>I&#8217;ll update this release schedule with info on 0.7 to 1.0 once I better know what needs to be done.</p>
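<p>For the curious, here is a rough sketch of the difference between the undirected and directed views mentioned in the schedule above. It is not my actual analysis: it assumes functions are numbered 0 to n-1 and that a set of known-live &#8220;roots&#8221; (entry points, exported functions) is given, and <code>dead_functions</code> is a name made up for this illustration.</p>

```cpp
#include <set>
#include <utility>
#include <vector>

// A call edge: first calls second. Functions are numbered 0..n-1.
using Call = std::pair<int, int>;

// Returns the functions that cannot be reached from any root.
// With directed == false every edge is followed both ways, so the live
// set is the union of the connected components that contain a root.
std::set<int> dead_functions(int n, const std::vector<Call> &calls,
                             const std::vector<int> &roots, bool directed) {
  std::vector<std::vector<int>> adj(n);
  for (const Call &c : calls) {
    adj[c.first].push_back(c.second);
    if (!directed)
      adj[c.second].push_back(c.first);
  }

  // Plain worklist traversal starting from the roots.
  std::vector<bool> live(n, false);
  std::vector<int> work;
  for (int r : roots) {
    if (!live[r]) {
      live[r] = true;
      work.push_back(r);
    }
  }
  while (!work.empty()) {
    int f = work.back();
    work.pop_back();
    for (int next : adj[f]) {
      if (!live[next]) {
        live[next] = true;
        work.push_back(next);
      }
    }
  }

  std::set<int> dead;
  for (int f = 0; f < n; ++f)
    if (!live[f])
      dead.insert(f);
  return dead;
}
```

<p>Take five functions where 0 is the only root, 0 calls 1, 2 calls 1, and 3 calls 4. The undirected view reports only 3 and 4 as dead, because 2 shares a component with live code. The directed view also reports 2, which calls a live function but is never called itself.</p>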
]]></content:encoded>
			<wfw:commentRss>http://ehren.wordpress.com/2010/01/15/project-plan-for-dps911-open-source-project/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/036e871fb77da20c503fb7438ca5ca6a?s=96&#38;d=identicon&#38;r=R" medium="image">
			<media:title type="html">ehren</media:title>
		</media:content>
	</item>
		<item>
		<title>Mystery bytes</title>
		<link>http://ehren.wordpress.com/2009/12/15/mystery-bytes-3/</link>
		<comments>http://ehren.wordpress.com/2009/12/15/mystery-bytes-3/#comments</comments>
		<pubDate>Tue, 15 Dec 2009 07:10:01 +0000</pubDate>
		<dc:creator>ehren</dc:creator>
				<category><![CDATA[Seneca]]></category>

		<guid isPermaLink="false">http://ehren.wordpress.com/?p=347</guid>
		<description><![CDATA[As previously reported, I&#8217;ve been trying to determine why my gcc plugin to optimize away virtual functions that always return zero has resulted in a rather mediocre code size reduction. In fact, the problem is far worse than I previously realized. It turns out the plugin has been adding 12240 bytes to libxul.so! This was [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=ehren.wordpress.com&amp;blog=9426727&amp;post=347&amp;subd=ehren&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>As previously reported, I&#8217;ve been trying to determine why my <a href="https://bugzilla.mozilla.org/attachment.cgi?id=416042">gcc plugin</a> to optimize away virtual functions that <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=517370">always return zero</a> has resulted in a rather mediocre code size reduction. In fact, the problem is far worse than I previously realized.</p>
<p>It turns out the plugin has been adding 12240 bytes to libxul.so! This was truly horrible, but I have found the cause. As I <a href="http://ehren.wordpress.com/2009/12/04/possible-reasons-for-poor-binary-size-reduction/">mentioned before</a>, the vast majority of the &#8216;optimizations&#8217; performed by the plugin were quite useless. Any code of the form <code>return call();</code> is considered by GCC at the <a href="http://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html">GIMPLE</a> level to be a &#8216;value returning gimple call&#8217; which will end up with a <a href="http://en.wikipedia.org/wiki/Three_address_code">3 address code</a> representation like so:<br />
<code>&nbsp;&nbsp;&nbsp;&nbsp;t1 = call();<br />
&nbsp;&nbsp;&nbsp;&nbsp;return t1;</code></p>
<p>The plugin then uselessly converts this to:<br />
<code>&nbsp;&nbsp;&nbsp;&nbsp;dummy_var = call();<br />
&nbsp;&nbsp;&nbsp;&nbsp;t1 = 0;<br />
&nbsp;&nbsp;&nbsp;&nbsp;return t1;</code></p>
<p>The problem is that this adds a significant amount of cruft to the generated code. It turns out GCC, and probably most compilers, when encountering <code>return call();</code>, will simply jump to the callee, letting the <a href="http://en.wikipedia.org/wiki/Function_prologue#Epilogue">epilogue</a> of the callee do the work of the caller. For example, I previously used the example of nsHTMLModElement::GetAttributeNS which looks like this:</p>
<p><pre class="brush: cpp;"> 
NS_SCRIPTABLE NS_IMETHOD
GetAttributeNS(const nsAString &amp; namespaceURI, const nsAString &amp; localName,
	       nsAString &amp; _retval NS_OUTPARAM)
{
  return _to GetAttributeNS(namespaceURI, localName, _retval);
}
</pre> </p>
<p>(See <a href="http://ehren.wordpress.com/2009/12/04/possible-reasons-for-poor-binary-size-reduction/">this post</a> for a GIMPLE breakdown)</p>
<p>Without my plugin, GCC generates this code for the function:  </p>
<p><pre class="brush: plain; toolbar: false; wrap-lines: false;">
0000000000000000 &lt;_ZN16nsHTMLModElement14GetAttributeNSERK18nsAString_internalS2_RS0_&gt;:
   0: e9 00 00 00 00        jmpq   5 &lt;_ZThn56_N16nsHTMLModElement14GetAttributeNSERK18nsAString_internalS2_RS0_&gt;
</pre></p>
<p>Here the <a href="http://en.wikipedia.org/wiki/Function_prologue">prologue</a> is left off as well since all behavior is deferred to the callee. With the &#8216;optimization&#8217; however the return value is explicitly zeroed out, and so GCC must generate code for a complete procedure:</p>
<p><pre class="brush: plain; toolbar: false; wrap-lines: false;">
0000000000000000 &lt;_ZN16nsHTMLModElement14GetAttributeNSERK18nsAString_internalS2_RS0_&gt;:
   0: 48 83 ec 08           sub    $0x8,%rsp
   4: e8 00 00 00 00        callq  9 &lt;_ZN16nsHTMLModElement14GetAttributeNSERK18nsAString_internalS2_RS0_+0x9&gt;
   9: 31 c0                 xor    %eax,%eax
   b: 48 83 c4 08           add    $0x8,%rsp
   f: c3                    retq
</pre></p>
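<p>The same two shapes can be reproduced outside of Mozilla with a small sketch. The function names here are made up for illustration, and <code>noinline</code> is a GCC/Clang attribute used only to keep the call visible. Compiling with something like <code>g++ -O2 -S</code> and comparing the two wrappers should show the difference, though exactly what gets emitted depends on the compiler version and flags:</p>

```cpp
// Hypothetical callee that, like an 'alwayszero' function, returns 0.
__attribute__((noinline)) int callee(int *out) {
  *out = 42;
  return 0;
}

// Shape 1: 'return call();'. Nothing is left to do after the call, so an
// optimizing compiler is free to emit a plain jump to the callee.
int forward(int *out) {
  return callee(out);
}

// Shape 2: roughly what the plugin's rewrite amounts to. The call result
// is discarded and the return value is set to zero afterwards, so the
// call is no longer the last thing the function does.
int forward_zeroed(int *out) {
  int dummy_var = callee(out);
  (void) dummy_var;
  return 0;
}
```

<p>Both wrappers return the same value to their callers. The only difference is whether the caller still has work to do once the callee returns.</p>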
<p>So what is to be done?</p>
<p>My first thought was to abandon this approach to the optimization entirely and go back to something like my <a href="http://ehren.wordpress.com/2009/10/24/a-gcc-hack-my-0-1-release/">0.1 release</a>. I was even able to build a do-nothing implementation of the <a href="http://gcc.gnu.org/viewcvs/trunk/gcc/tree-ssa-propagate.c?revision=154645&amp;view=markup">propagation engine</a> by copying <a href="http://gcc.gnu.org/viewcvs/trunk/gcc/tree-ssa-propagate.h?view=markup&amp;revision=151968">tree-ssa-propagate.h</a> into the GCC plugin include directory. Unfortunately, writing one of these passes requires a lot of code because you have to do all of the <a href="http://en.wikipedia.org/wiki/Constant_folding">folding</a> yourself. I also don&#8217;t think it&#8217;s possible to just copy all of the code from the <a href="http://gcc.gnu.org/viewcvs/trunk/gcc/tree-ssa-ccp.c?revision=154645&amp;view=markup">conditional constant propagation</a> pass into a plugin (I&#8217;ve tried&#8230; GCC doesn&#8217;t export enough to the plugin framework).</p>
<p>Another approach was to add a couple of ad hoc conditions to the plugin so that statements of the form <code>return call();</code> are not &#8216;optimized&#8217;. One thing I noticed was that if a function has only one return of the form <code>return call();</code>, that return will be in the same basic block as the call. Therefore I could use code like this to weed out the boring stuff:</p>
<p><pre class="brush: cpp;">
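/* A block is boring when it ends with the return statement itself, as
   happens when the function's only return is 'return call();'.  */
static bool
boring_block (basic_block bb)
{
  gimple ret_stmt = gimple_seq_last_stmt (bb_seq (bb));
  /* gimple_seq_last_stmt yields NULL for an empty sequence.  */
  return ret_stmt != NULL &amp;&amp; gimple_code (ret_stmt) == GIMPLE_RETURN;
}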
</pre></p>
<p>This worked pretty well. The number of &#8216;optimized&#8217; call sites went down to 55 (from 1305), which was just what I wanted given that the majority of those call sites are not optimization worthy. Unfortunately, I was still increasing the size of libxul.so, except this time by 14 bytes.</p>
<p>One possibility was that functions with multiple returns end up looking like this (this is after the &#8216;optimization&#8217; has taken place):</p>
<p><pre class="brush: plain; wrap-lines: false;">
&lt;bb 6&gt;:
  D.74138_10 = &amp;this_6(D)-&gt;D.72191;
  dummy_var.24_17 = nsSVGGFrame::AttributeChanged (D.74138_10, aNameSpaceID_2(D), aAttribute_4(D), aModType_11(D));
  D.74137_12 = 0;

&lt;bb 7&gt;:
  # D.74137_1 = PHI &lt;0(5), D.74137_12(6)&gt;
  return D.74137_1;
}
</pre></p>
<p>(See the code for this function <a href="http://scotland.proximity.on.ca/dxr/mozilla-central/layout/svg/base/src/nsSVGUseFrame.cpp.html#l137">here</a>. Note that it has one 0-constant return and one &#8216;alwayszero call&#8217; return. It doesn&#8217;t fit my <code>boring_block</code> condition above, but this is still a useless optimization.)</p>
<p>So I moved on to the next approach. It appears that any time the return value is checked, the basic block of the call will contain a conditional statement of some form (see the GIMPLE code for nsSplittableFrame::AttributeChanged in <a href="http://ehren.wordpress.com/2009/12/04/possible-reasons-for-poor-binary-size-reduction/">this post</a> for an example). This led me to a new condition for only performing useful optimizations:</p>
<p><pre class="brush: cpp;">
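/* Scan from GSI to the end of its basic block.  The block is interesting
   if it contains a branch, since that suggests the call's return value
   is actually checked.  */
static bool
is_interesting_block (gimple_stmt_iterator gsi)
{
  for (; !gsi_end_p (gsi); gsi_next (&amp;gsi))
    {
      gimple stmt = gsi_stmt (gsi);
      enum gimple_code code = gimple_code (stmt);
      if (code == GIMPLE_COND || code == GIMPLE_SWITCH)
        return true;
    }
  return false;
}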
</pre></p>
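<p>To see how the two conditions sort the cases discussed so far, here is a toy model that runs without GCC. The <code>Stmt</code> enum and both helper functions are stand-ins invented for this sketch, not GCC&#8217;s real types. They only mirror the logic of <code>boring_block</code> and <code>is_interesting_block</code>:</p>

```cpp
#include <cstddef>
#include <vector>

// Toy stand-ins for GIMPLE statement codes; these are not GCC's types.
enum class Stmt { Assign, Call, Cond, Switch, Return };
using Block = std::vector<Stmt>;

// Mirrors boring_block: the block ends with the return itself.
bool ends_in_return(const Block &bb) {
  return !bb.empty() && bb.back() == Stmt::Return;
}

// Mirrors is_interesting_block: some statement at or after position
// 'from' is a conditional or a switch.
bool has_branch_from(const Block &bb, std::size_t from) {
  for (std::size_t i = from; i < bb.size(); ++i)
    if (bb[i] == Stmt::Cond || bb[i] == Stmt::Switch)
      return true;
  return false;
}
```

<p>A single <code>return call();</code> block is modelled as <code>{Call, Return}</code> and is caught by the first check. The multiple return case shown earlier is roughly <code>{Assign, Call, Assign}</code>: it slips past the first check, but the second rejects it because the block has no branch. A block like <code>{Call, Cond}</code>, where the result is actually tested, is the only one of the three that the second check accepts.</p>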
<p>With this, the number of optimized call sites is reduced to 26, and I&#8217;m certain all of these optimizations result in useful code pruning (although I haven&#8217;t run a tree-dump build yet). Here&#8217;s the weird thing though: the size results are exactly the same as when I was optimizing 55 callsites. That is, the plugin is still adding 14 bytes to libxul.so! Argh.</p>
<p>Here&#8217;s an even bigger mystery: I ran a build using a &#8216;disabled&#8217; version of the plugin. It registers a new pass, but no optimizations are performed. The results are identical to when I was running the plugin with 55 callsites optimized and with 26 callsites optimized: 14 bytes are added to libxul.so. My first thought is that adding a new pass somehow interferes with existing GCC optimizations, especially since the plugin must set the <code>TODO_update_ssa</code> flag (since a new statement is added). Why do my 26 useful optimizations not have any effect though?</p>
<p>I&#8217;m going to have to run another <a href="http://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html">tree-dump</a> build of mozilla-central and then play around a bit more with <a href="http://en.wikipedia.org/wiki/Objdump">objdump</a>. Something is very strange here.</p>
]]></content:encoded>
			<wfw:commentRss>http://ehren.wordpress.com/2009/12/15/mystery-bytes-3/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/036e871fb77da20c503fb7438ca5ca6a?s=96&#38;d=identicon&#38;r=R" medium="image">
			<media:title type="html">ehren</media:title>
		</media:content>
	</item>
	</channel>
</rss>
