When to use __fastcall

by Kent Reisdorph

A certain amount of confusion exists among C++Builder programmers regarding the __fastcall keyword in VCL applications. This article explains when to use __fastcall and, more importantly, when not to use it.

Calling conventions

First I should say that __fastcall is the keyword used to specify the Register calling convention. A calling convention tells the compiler how it should call functions: how the function name will be exported, how the function's parameters should be passed, and who is responsible for cleaning up the stack (the called function or the calling function). Table A lists the calling conventions supported by C++Builder and their associated keywords.

Table A: Calling conventions and their keywords.

Calling convention    Keyword
C                     __cdecl
Standard call         __stdcall
Register              __fastcall
Pascal                __pascal

The default calling convention for a C++Builder project is the C calling convention. You can see this by looking at the Advanced Compiler tab of the Project Options dialog. (I won't attempt to explain the details of calling conventions in this article. If you wish to learn more about calling conventions, search for "calling conventions" in the C++Builder help. You will find several help topics dedicated to this subject.)

The important thing to understand is that the default calling convention for a C++Builder project is the C calling convention, and that the default for the VCL is the Register calling convention.

__fastcall and the VCL

All functions in the VCL are exported using the Register calling convention. This is most obvious when you generate an event handler. Take a form's OnCreate event handler, for example:

void __fastcall 
TForm1::FormCreate(TObject *Sender)
{
}

It's obvious from looking at this declaration that the Register calling convention is being used (the __fastcall keyword precedes the function name). If for some strange reason you were tempted to remove the __fastcall keyword from an event handler, you would find that the application wouldn't even compile. It's a fact that __fastcall is required when dealing with VCL event handlers, and there is nothing you can (or should want to) do about that.

When to use __fastcall

C++Builder programmers (regardless of their experience level) often ask when they should use the __fastcall keyword. My response is always the same: Use __fastcall when you must, and never use it when you don't have to. That rather broad statement requires a bit of explanation.

As I have said, the C++Builder IDE automatically adds the __fastcall keyword to the event handlers it generates. This is as it should be, and you don't have to give it much thought. Sometimes, however, you will have to assign an event handler to a VCL event in code. In that case you must declare the event handler yourself, using the __fastcall keyword, and assign it to the event you are interested in handling.

Some VCL classes, such as TList, allow you to specify a callback function (a sort routine in the case of TList). You will have to use the __fastcall keyword in this case, too, as the VCL expects it.
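The callback requirement can be sketched as follows. This is a stand-in rather than VCL code: VCL_CALL, SortCompare, LessByCallback, SortItems, and CompareInts are hypothetical names, and outside C++Builder the macro expands to nothing so that the sketch compiles with any standard C++ compiler. In a real project the comparison function would be handed to TList::Sort() instead of SortItems().

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical macro: the Register convention under C++Builder,
// nothing at all under other compilers.
#if defined(__BORLANDC__)
#define VCL_CALL __fastcall
#else
#define VCL_CALL
#endif

// Stand-in for a VCL-style comparison callback type. The calling
// convention is part of the pointer type, so a callback declared
// with a different convention does not match it.
typedef int VCL_CALL (*SortCompare)(void *item1, void *item2);

// Application-supplied callback; it carries the same convention.
int VCL_CALL CompareInts(void *item1, void *item2)
{
  assert(item1 != 0 && item2 != 0);
  int a = *static_cast<int *>(item1);
  int b = *static_cast<int *>(item2);
  return (a > b) - (a < b);  // negative, zero, or positive
}

// Adapts the three-way callback to the "less than" test std::sort needs.
struct LessByCallback
{
  SortCompare compare;
  explicit LessByCallback(SortCompare c) : compare(c) {}
  bool operator()(void *lhs, void *rhs) const
  {
    return compare(lhs, rhs) < 0;
  }
};

// Hypothetical stand-in for a library routine that sorts through
// a caller-supplied callback.
void SortItems(std::vector<void *> &items, SortCompare compare)
{
  std::sort(items.begin(), items.end(), LessByCallback(compare));
}
```

Calling SortItems(items, CompareInts) orders the pointers by the integers they point to. The point of the sketch is only that the convention appears in two places, the callback type and the callback itself, and that the two must agree.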

That sums up when you must use __fastcall, but it doesn't address those situations where use of this keyword is optional. The IDE takes care of generating event handlers for you. In most applications, though, you will have to add functions of your own to your main unit's class and to supporting units. When declaring these functions you can use any calling convention you like. You might be tempted to use __fastcall when you declare these functions. After all, that is what you see in IDE-generated functions, and it is common to use those function declarations as a guide. However, it is not required that you use __fastcall for your internal functions.

I never use __fastcall for my internal functions, and my advice is that you don't either. In short, just declare your internal functions with no calling convention at all and let the compiler use the project options to determine the calling convention. The primary reason for this advice is that using __fastcall is simply not necessary. There is another reason, however, that I will explain in the following section.

Is __fastcall actually fast?

If you look up calling conventions in the C++Builder help, you will find that the Register calling convention specifies that function parameters should be passed in the CPU registers. (The other calling conventions pass function parameters on the stack.) Obviously, there are only so many registers available. The compiler passes parameters in registers while registers remain available and passes any remaining parameters on the stack.

At first glance it would appear that passing function parameters in registers would be much faster than passing parameters on the stack. Not long ago I received an email from a reader regarding use of __fastcall. The reader said that his tests indicated that the Register calling convention was actually slower than the default calling convention. Curious, I decided to conduct a test. I created a simple application that had four functions that were variations on the following:

int TForm1::GetValue(int x,int y,int z)
{
  return x + y + z;
}

Naturally, I declared each of the functions to use one of the four available calling conventions.
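The four variants might be declared along these lines. This is a sketch, not the original test code: it uses free functions rather than TForm1 members so that it stands alone, only the name GetValueC appears in this article (the other three names are assumed), and the CC_ macros are hypothetical helpers that expand to nothing outside C++Builder so that the block compiles elsewhere.

```cpp
#include <cassert>

// Hypothetical helper macros: real keywords under C++Builder,
// nothing under other compilers.
#if defined(__BORLANDC__)
#define CC_CDECL    __cdecl
#define CC_STDCALL  __stdcall
#define CC_FASTCALL __fastcall
#define CC_PASCAL   __pascal
#else
#define CC_CDECL
#define CC_STDCALL
#define CC_FASTCALL
#define CC_PASCAL
#endif

// The keyword sits between the return type and the function name.
int CC_CDECL    GetValueC(int x, int y, int z)      { return x + y + z; }
int CC_STDCALL  GetValueStd(int x, int y, int z)    { return x + y + z; }
int CC_FASTCALL GetValueReg(int x, int y, int z)    { return x + y + z; }
int CC_PASCAL   GetValuePascal(int x, int y, int z) { return x + y + z; }

// The convention changes how the arguments travel, never the result.
void CheckVariantsAgree()
{
  assert(GetValueC(1, 2, 3) == GetValueStd(1, 2, 3));
  assert(GetValueC(1, 2, 3) == GetValueReg(1, 2, 3));
  assert(GetValueC(1, 2, 3) == GetValuePascal(1, 2, 3));
}
```

Because the bodies are identical, any difference in timing between the variants comes from the call sequence the compiler generates for each convention.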

Next I added four buttons to the form, one for each of the calling conventions. The OnClick event handler for the button that calls the __cdecl version of the function looks like this:

void __fastcall 
TForm1::CBtnClick(TObject *Sender)
{
  CBtn->Enabled = false;
  for (int i=0;i<500000;i++)
    int result = GetValueC(1, 2, 3);
  CBtn->Enabled = true;
}

As you can see, this code simply calls the GetValueC() function 500,000 times.

Finally, I ran the application under StopWatch, the profiler that ships with TurboPower's Sleuth QA Suite. (I could have used GetTickCount() to time the results, but the results would not have been nearly as accurate.) I clicked each button five times to get an average. Figure A shows a screen shot of StopWatch reporting the results.

Figure A

StopWatch shows that the Register calling convention is not particularly fast.
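Readers without a profiler could approximate the measurement with a small timing harness. The sketch below is not the setup used for Figure A: it swaps in std::chrono::steady_clock from the standard library in place of StopWatch, which assumes a compiler with C++11 library support, and TimeCalls, GetValueDefault, GetValueReg, and CC_FASTCALL are hypothetical names. An optimizing compiler may inline the calls, in which case the calling convention is no longer being measured.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical macro: __fastcall under C++Builder, nothing elsewhere.
#if defined(__BORLANDC__)
#define CC_FASTCALL __fastcall
#else
#define CC_FASTCALL
#endif

// One function left at the project default, one forced to Register.
int GetValueDefault(int x, int y, int z)         { return x + y + z; }
int CC_FASTCALL GetValueReg(int x, int y, int z) { return x + y + z; }

// Calls fn(1, 2, 3) the given number of times and returns the
// elapsed wall-clock time in milliseconds.
template <typename Fn>
double TimeCalls(Fn fn, int iterations)
{
  assert(iterations > 0);
  volatile int sink = 0;  // keeps each result live so the loop is not removed
  std::chrono::steady_clock::time_point start =
      std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i)
    sink = fn(1, 2, 3);
  std::chrono::steady_clock::time_point stop =
      std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A comparison would call TimeCalls(GetValueDefault, 500000) and TimeCalls(GetValueReg, 500000) several times each and average the results, in the same way that each button was clicked five times. Individual runs are noisy, so a single pair of numbers should not be read as a verdict.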

I found the results interesting, to say the least. The C, Standard call, and Pascal calling conventions were very close in total execution time. Running the application several times, I noticed slight differences in which of these three calling conventions produced the fastest results. Regardless of which one came out the winner, they were all very close. What was most noticeable, however, was that the Register calling convention consistently produced the worst results. Roughly speaking, the Register calling convention proved to be about 10% slower than any other calling convention, at least when using the default project options.

In an attempt to be fair, I tried a variety of compiler options. I first changed the Optimization option to optimize for speed. This improved the performance of the Register calling convention somewhat, but not enough to matter (the Pascal calling convention was the loser in this particular test). Next I changed the Instruction set to Pentium and the Register variables option to Automatic. This made some difference as well, but again, not enough to matter.

The bottom line with my tests was that the Register calling convention was not any faster than the C calling convention and was, in most cases, slower. Granted, the differences in time are minuscule, but I proved to myself that it would be difficult for proponents of __fastcall to claim that the Register calling convention is faster than any other.

Conclusion

To repeat what I have already said, I advise that you never use __fastcall unless it is specifically required by the VCL. Using __fastcall isn't necessary, it clutters up your code, and it appears to execute a bit slower than the default calling convention.