Ehren's Blog

Possible reasons for poor binary size reduction

Posted in Seneca by ehren on December 4, 2009

For a bit of context, here’s the basic reason why I was only able to achieve 24 bytes’ worth of optimization with the current number of functions patched.

Consider GetAttributeNS, which is defined in a macro.

It looks like this:

NS_SCRIPTABLE NS_IMETHOD
GetAttributeNS(const nsAString & namespaceURI, const nsAString & localName,
	       nsAString & _retval NS_OUTPARAM)
{
  return _to GetAttributeNS(namespaceURI, localName, _retval);
}

Here’s what it looks like (in GIMPLE SSA form) after being run through the plugin:

nsHTMLModElement::GetAttributeNS(nsAString_internal const&, nsAString_internal const&, nsAString_internal&) (struct nsHTMLModElement * const this, const struct nsAString_internal & namespaceURI, const struct nsAString_internal & localName, struct nsAString_internal & _retval)
{
  nsresult dummy_var.33;
  struct nsGenericElement * D.44812;
  nsresult D.44811;

<bb 2>:
  D.44812_2 = &this_1(D)->D.43447.D.33567.D.30722.D.30660;
  dummy_var.33_9 = nsGenericElement::GetAttributeNS (D.44812_2, namespaceURI_3(D), localName_4(D), _retval_5(D));
  D.44811_6 = 0;
  return D.44811_6;
}

The optimization is performed (because the statement is a value-returning GIMPLE call), but there’s no real code elimination. Even though this function now knows that the call to an alwayszero function yields 0, that doesn’t really help: the call itself has to stay, and all the function does afterwards is return the value. And the function returning the value is itself virtual!
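To make the point concrete, here is a minimal C++ sketch of the same situation. Every name in it (nsresult_t, base_get_attribute, wrapper_original, wrapper_patched, check_wrappers) is a hypothetical stand-in rather than Mozilla or plugin code; it only mimics the shape of the dump above, where the call is kept and a constant 0 is returned.

```cpp
#include <cassert>

typedef unsigned int nsresult_t;  // hypothetical stand-in for nsresult

static int g_calls = 0;           // counts how often the callee runs

// Stand-in for an alwayszero callee: it has a side effect and always returns 0.
nsresult_t base_get_attribute() {
    ++g_calls;
    return 0;
}

// The forwarding wrapper as written in the source: just return the callee's result.
nsresult_t wrapper_original() {
    return base_get_attribute();
}

// Roughly what the wrapper looks like after the rewrite: the call is kept
// (its side effects still matter), the result goes to a dummy variable,
// and a constant 0 is returned instead.
nsresult_t wrapper_patched() {
    nsresult_t dummy_var = base_get_attribute();
    (void)dummy_var;  // the value is never read
    return 0;
}

// Small usage check: both versions return 0 and both still make the call.
void check_wrappers() {
    g_calls = 0;
    assert(wrapper_original() == 0);
    assert(wrapper_patched() == 0);
    assert(g_calls == 2);
}
```

Both versions consist of one call and one return, so there is no branch for the compiler to delete. That matches the tiny size difference reported above.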

(Note: It’s somewhat shameless to post the above because I just ripped it out of a pastebin from an IRC conversation, but there it is.)

Are there any significant optimizations?

The next question is whether or not my plugin is able to cut out any unnecessary NS_FAILED checks, since that’s kind of the point. In fact, it has in some cases.

Consider nsImageFrame::AttributeChanged

It looks like this:

NS_IMETHODIMP
nsImageFrame::AttributeChanged(PRInt32 aNameSpaceID,
                               nsIAtom* aAttribute,
                               PRInt32 aModType)
{
  nsresult rv = nsSplittableFrame::AttributeChanged(aNameSpaceID,
                                                    aAttribute, aModType);
  if (NS_FAILED(rv)) {
    return rv;
  }
  if (nsGkAtoms::alt == aAttribute)
  {
    PresContext()->PresShell()->FrameNeedsReflow(this,
                                                 nsIPresShell::eStyleChange,
                                                 NS_FRAME_IS_DIRTY);
  }

  return NS_OK;
}

nsSplittableFrame::AttributeChanged is an alwayszero function, so nsImageFrame::AttributeChanged will be processed by the plugin. Here’s what that processing will do:

nsImageFrame::AttributeChanged(int, nsIAtom*, int) (struct nsImageFrame * const this, PRInt32 aNameSpaceID, struct nsIAtom * aAttribute, PRInt32 aModType)
{
  nsresult dummy_var.291;
  struct nsIPresShell * D.96275;
  struct nsStyleContext * D.96272;
  struct nsRuleNode * D.96271;
  struct nsPresContext * D.96269;
  struct nsPresContext * D.96269;
  nsresult rv;
  int (*__vtbl_ptr_type) (void) D.93660;
  int (*__vtbl_ptr_type) (void) * D.93659;
  int (*__vtbl_ptr_type) (void) * D.93658;
  struct nsIFrame * D.93655;
  struct nsIAtom * alt.65;
  long int D.93648;
  long int D.93647;
  int rv.64;
  struct nsFrame * D.93645;

<bb 2>:
  D.93645_3 = &this_2(D)->D.53019.D.51667;
  dummy_var.291_32 = nsFrame::AttributeChanged (D.93645_3, aNameSpaceID_4(D), aAttribute_5(D), aModType_6(D));
  rv_7 = 0;
  rv.64_8 = (int) rv_7;
  D.93647_9 = rv.64_8 < 0;
  D.93648_10 = __builtin_expect (D.93647_9, 0);
  if (D.93648_10 != 0)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 3>:
  rv_11 = rv_7;
  goto <bb 7>;

<bb 4>:
  alt.65_12 = alt;
  if (alt.65_12 == aAttribute_5(D))
    goto <bb 5>;
  else
    goto <bb 6>;

<bb 5>:
  D.93655_13 = &this_2(D)->D.53019.D.51667.D.51581.D.46508;
  D.96272_14 = D.93655_13->mStyleContext;
  D.96272_26 = D.96272_14;
  D.96271_27 = D.96272_26->mRuleNode;
  D.96269_28 = D.96271_27->mPresContext;
  D.96269_30 = D.96269_28;
  D.96269_15 = D.96269_30;
  D.96275_16 = D.96269_15->mShell;
  D.96275_29 = D.96275_16;
  D.96275_31 = D.96275_29;
  D.96275_17 = D.96275_31;
  D.93658_18 = D.96275_17->D.15138.D.14613._vptr.nsISupports;
  D.93659_19 = D.93658_18 + 200;
  D.93660_20 = *D.93659_19;
  D.93655_21 = &this_2(D)->D.53019.D.51667.D.51581.D.46508;
  OBJ_TYPE_REF(D.93660_20;D.96275_17->25) (D.96275_17, D.93655_21, 2, 1024);

<bb 6>:
  rv_22 = 0;

<bb 7>:
  # rv_1 = PHI <rv_11(3), 0(6)>
  return rv_1;
}

Clearly the optimization has been performed, but has GCC propagated the 0 and cut out the NS_FAILED check?

It has, in fact (here’s the IR after a later pass):

nsImageFrame::AttributeChanged(int, nsIAtom*, int) (struct nsImageFrame * const this, PRInt32 aNameSpaceID, struct nsIAtom * aAttribute, PRInt32 aModType)
{
  struct nsIPresShell * D.96275;
  struct nsStyleContext * D.96272;
  struct nsRuleNode * D.96271;
  struct nsPresContext * D.96269;
  struct nsPresContext * D.96269;
  int (*__vtbl_ptr_type) (void) D.93660;
  int (*__vtbl_ptr_type) (void) * D.93659;
  int (*__vtbl_ptr_type) (void) * D.93658;
  struct nsIFrame * D.93655;
  struct nsIAtom * alt.65;
  struct nsFrame * D.93645;

<bb 2>:
  D.93645_3 = &this_2(D)->D.53019.D.51667;
  nsFrame::AttributeChanged (D.93645_3, aNameSpaceID_4(D), aAttribute_5(D), aModType_6(D));
  alt.65_12 = alt;
  if (alt.65_12 == aAttribute_5(D))
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 3>:
  D.96272_14 = this_2(D)->D.53019.D.51667.D.51581.D.46508.mStyleContext;
  D.96271_27 = D.96272_14->mRuleNode;
  D.96269_28 = D.96271_27->mPresContext;
  D.96275_16 = D.96269_28->mShell;
  D.93658_18 = D.96275_16->D.15138.D.14613._vptr.nsISupports;
  D.93659_19 = D.93658_18 + 200;
  D.93660_20 = *D.93659_19;
  D.93655_21 = &this_2(D)->D.53019.D.51667.D.51581.D.46508;
  OBJ_TYPE_REF(D.93660_20;D.96275_16->25) (D.96275_16, D.93655_21, 2, 1024);

<bb 4>:
  return 0;
}

Note that dummy_var is gone too.
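For readers who would rather not diff the two dumps by hand, here is a rough source-level sketch of the before and after. It is an illustration under assumptions, not the real Mozilla code: nsresult_t, kOk, failed, base_attribute_changed, frame_needs_reflow and the is_alt parameter are hypothetical stand-ins, and failed() simply tests the sign bit, which is what the `(int) rv < 0` comparison in the first dump checks.

```cpp
#include <cassert>
#include <cstdint>

typedef std::uint32_t nsresult_t;  // hypothetical stand-in for nsresult
const nsresult_t kOk = 0;          // stands in for NS_OK

// Sign-bit test standing in for the NS_FAILED check.
inline bool failed(nsresult_t rv) { return (rv & 0x80000000u) != 0; }

static int g_base_calls = 0;  // how often the base-class method ran
static int g_reflows = 0;     // how often a reflow was requested

// Stand-in for the alwayszero base-class method.
nsresult_t base_attribute_changed() { ++g_base_calls; return kOk; }

// Stand-in for PresContext()->PresShell()->FrameNeedsReflow(...).
void frame_needs_reflow() { ++g_reflows; }

// The method as written: call, check for failure, maybe reflow.
nsresult_t attribute_changed_original(bool is_alt) {
    nsresult_t rv = base_attribute_changed();
    if (failed(rv)) {
        return rv;
    }
    if (is_alt) {
        frame_needs_reflow();
    }
    return kOk;
}

// What is left once rv is known to be 0: the failure branch can never be
// taken, so only the call, the attribute test and `return 0` remain.
nsresult_t attribute_changed_folded(bool is_alt) {
    base_attribute_changed();  // kept for its side effects
    if (is_alt) {
        frame_needs_reflow();
    }
    return 0;
}

// Small usage check: the two versions are observably equivalent.
void check_folding() {
    g_base_calls = 0;
    g_reflows = 0;
    assert(attribute_changed_original(true) == attribute_changed_folded(true));
    assert(attribute_changed_original(false) == attribute_changed_folded(false));
    assert(g_base_calls == 4);
    assert(g_reflows == 2);
}
```

The folded version lines up with the later dump: the early-return block and the rv temporaries are gone, while the call and the alt comparison survive.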

I would conclude that the optimization works for Firefox. It’s just that so far I’ve patched some really boring functions!

One Response

  1. Monty said, on September 19, 2013 at 2:40 am

    This page truly has all the information I needed about this subject and didn’t know who to ask.

