Using Watcom's Register-based Calling Convention With TASM
I suppose I'm writing this post for my own benefit primarily. I'll likely forget many of these details in a month, and then go and try to write a bunch more assembly and run into problems. So I'll try to proactively solve that future problem for myself. Everything here is better documented in the compiler documentation. However, it is scattered around a bit and of course isn't written with specific examples for using TASM.
One of the performance benefits Watcom brought with it, and a pretty big deal at the time, was that its default calling convention used registers for up to the first 4 arguments to called functions. Past that, the stack is used as per standard C calling conventions.
As mentioned, this calling convention is the default, but it can be globally changed via the CPU instruction code generation compiler switch. For example, /3 and /3r both select 386 instructions with register-based calling convention, while /3s selects 386 instructions with stack-based calling convention.
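If some C code needs to know which of these conventions it was built with, the compiler's predefined macros might help. This is a minimal sketch, assuming the __SW_3R and __SW_3S macros that I believe Watcom defines for the /3r and /3s switches (worth double-checking against the predefined macro list in the compiler documentation). The call_conv enum and the watcom_call_conv function are names made up for this example.

```c
/* Hypothetical helper: reports which argument-passing convention this
   file was compiled with. The __SW_3R / __SW_3S macro names are an
   assumption based on Watcom's predefined macro list. */
enum call_conv { CONV_UNKNOWN, CONV_REGISTER, CONV_STACK };

enum call_conv watcom_call_conv(void)
{
#if defined(__WATCOMC__) && defined(__SW_3S)
    return CONV_STACK;      /* built with /3s */
#elif defined(__WATCOMC__) && defined(__SW_3R)
    return CONV_REGISTER;   /* built with /3r */
#else
    return CONV_UNKNOWN;    /* some other switch, or not Watcom at all */
#endif
}
```

Anything other than those two switches falls through to CONV_UNKNOWN, so this only covers the examples mentioned here.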
Borland Turbo Assembler (TASM) does not natively support this register-based calling convention among its varied support for programming-language specific calling conventions. However, it does let you use its "NOLANGUAGE" option (which is the default if no language is specified) and then you can handle all the details yourself.
ideal
p386
model flat
codeseg
locals
public add_numbers_
; int add_numbers(int a, int b)
; inputs:
; eax = a
; edx = b
; return:
; eax
proc add_numbers_ near
push ebp
mov ebp, esp
add eax, edx
pop ebp
ret
endp
end
This is pretty normal-looking TASM, complete with the usual assembly prologue and epilogue code. Note that we are intentionally not specifying a language modifier.
So, first off, add_numbers_ has a trailing underscore to match what Watcom expects by default. If you don't like this for whatever reason, you can change the name here to your liking, but the use of a #pragma in your C code is then necessary to inform Watcom about the different naming convention for this function.
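As a rough sketch of what that #pragma could look like: the auxiliary pragma accepts a name pattern in which * stands for the C identifier, so a pattern of just "*" asks for an undecorated symbol. This assumes the assembly side is changed to match (public add_numbers and proc add_numbers near, with no underscore). The plain C body at the bottom is only a stand-in so the snippet also builds without Watcom and the TASM object file.

```c
#ifdef __WATCOMC__
/* "*" is replaced by the C name, so Watcom will look for a symbol
   called exactly "add_numbers" instead of the default "add_numbers_". */
#pragma aux add_numbers "*";
#endif

int add_numbers(int a, int b);

#ifndef __WATCOMC__
/* Stand-in for the assembly routine when not building with Watcom. */
int add_numbers(int a, int b) { return a + b; }
#endif
```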
Second, via the magic of the register-based calling convention, Watcom will have our two number arguments all ready for us in eax and edx. Our return value is assumed to be in eax, and that is correct in our case so we're all good.
The great thing is, we don't actually need to do anything fancy to call this function from our C code.
// prototype
int add_numbers(int a, int b);
// usage
int result;
result = add_numbers(10, 20);
But that was the simple case.
This register-based calling convention actually places the burden on the called function to clean things up before returning. This includes preserving some register values as well. According to the documentation: "All used 80x86 registers must be saved on entry and restored on exit except those used to pass arguments and return values." So, in our add_numbers_ function, if we had wanted to use ecx, we would need to push and pop it during the prologue and epilogue code. But we didn't need to do so for eax and edx because those were used to pass arguments and return a value.
As mentioned previously, the stack gets used for arguments once all the registers have been used for arguments (by default, eax, edx, ebx, ecx in that order). In this case, the called function is responsible for popping them off the stack when it returns. So, if there were two int arguments that were passed on the stack, we would need to do a ret 8 to return.
; For this function, using the default register calling convention, the first 4 arguments
; will be passed in registers eax, edx, ebx and ecx. The last two will be passed on the stack.
; void direct_blit_4(int width4,
; int lines,
; byte *dest,
; byte *src,
; int dest_y_inc,
; int src_y_inc);
proc direct_blit_4_ near
arg @@dest_y_inc:dword, @@src_y_inc:dword
push ebp
mov ebp, esp ; don't try to be clever and move this elsewhere!
push edi ; likewise, don't try to group the pushes all together!
push esi
; code here (that also modifies edi and esi, thus the additional pushes/pops)
pop esi
pop edi
pop ebp
ret 8
endp
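The 8 in ret 8 is just the number of bytes of arguments that ended up on the stack. As a small worked example of that arithmetic, assuming every argument is a 4-byte int or pointer (doubles and structs follow different rules), it could be sketched like this. WATCOM_REG_ARGS and stack_arg_bytes are names invented for the example.

```c
/* Number of arguments passed in registers by default: eax, edx, ebx, ecx. */
#define WATCOM_REG_ARGS 4u

/* Bytes the called function has to pop with "ret N", assuming every
   argument is a 4-byte int or pointer. */
unsigned stack_arg_bytes(unsigned arg_count)
{
    if (arg_count <= WATCOM_REG_ARGS)
        return 0u;  /* everything fit in registers, a plain "ret" will do */
    return (arg_count - WATCOM_REG_ARGS) * 4u;
}
```

For direct_blit_4 with its six arguments this gives (6 - 4) * 4 = 8, and for add_numbers with two arguments it gives 0.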
Is this all too cumbersome to worry about? Well, I don't really think it's a big deal, but there is a way we can remove ourselves from this burden.
Let's say we didn't want to have to worry about preserving any of eax, ebx, ecx, edx, edi, or esi regardless of how many arguments our function has and what (if any) return value it uses. Also, maybe we don't want to have to worry about popping arguments off the stack ourselves when our assembly functions return.
// define our "asmcall" calling convention
int add_numbers(int a, int b); // no change to the function prototype is necessary
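A minimal sketch of how such a convention might be declared with auxiliary pragmas, under a few assumptions: "parm caller" tells the compiler that the caller pops any stack arguments (so the assembly routine ends with a plain ret), no register set is given so the default argument registers still apply, and the modify list names the registers the routine is free to clobber. The plain C body at the bottom is only a stand-in so the snippet also builds without Watcom and the TASM object file.

```c
#ifdef __WATCOMC__
/* Define the "asmcall" convention: caller cleans up the stack, and the
   listed registers may be trashed by the called function. */
#pragma aux asmcall parm caller modify [eax ebx ecx edx edi esi];

/* Apply it to each assembly function that should use it. */
#pragma aux (asmcall) add_numbers;
#endif

int add_numbers(int a, int b);

#ifndef __WATCOMC__
/* Stand-in for the assembly routine when not building with Watcom. */
int add_numbers(int a, int b) { return a + b; }
#endif
```

Calls from C look exactly the same as before. Only the code the compiler generates around each call changes.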
What if we actually wanted to use the normal C stack-based calling convention for our assembly functions and ignore this register argument nonsense? Maybe you're using an existing library and it was written for other compilers that don't use this register-based calling convention.
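One way this could be sketched is with another auxiliary pragma that hands the compiler no registers at all for arguments. The asmstackcall alias and the legacy_sum function are hypothetical names for this example, and the modify list simply reuses the register set from earlier rather than anything a particular library requires. The assembly routine would then need to read its arguments from the stack instead of from eax and edx.

```c
#ifdef __WATCOMC__
/* "parm caller []": no argument registers, and the caller pops the
   stack afterwards, so the routine returns with a plain "ret". */
#pragma aux asmstackcall parm caller [] modify [eax ebx ecx edx edi esi];
#pragma aux (asmstackcall) legacy_sum;
#endif

/* Hypothetical routine from an existing stack-based assembly library. */
int legacy_sum(int a, int b);

#ifndef __WATCOMC__
/* Stand-in for the assembly routine when not building with Watcom. */
int legacy_sum(int a, int b) { return a + b; }
#endif
```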
Watcom also pre-defines the cdecl symbol for this same purpose, which you can and probably should use instead of defining your own.
The empty brackets [] denote an empty register set to be used for parameter passing. That is, we are saying not to use any registers, so the stack is used instead for all of them. With that in mind, we could expand the set of default registers used for parameter passing:
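As a sketch of that idea, assuming a single register set that adds edi and esi to the four default registers. The sum6 function is a made-up example, and I have not verified the order in which the compiler hands out registers from a set like this, so check the generated code before relying on it.

```c
#ifdef __WATCOMC__
/* Up to six arguments may now travel in registers. */
#pragma aux asmcall parm caller [eax edx ebx ecx edi esi] \
                    modify [eax ebx ecx edx edi esi];
#pragma aux (asmcall) sum6;
#endif

/* Hypothetical six-argument assembly routine. */
int sum6(int a, int b, int c, int d, int e, int f);

#ifndef __WATCOMC__
/* Stand-in for the assembly routine when not building with Watcom. */
int sum6(int a, int b, int c, int d, int e, int f)
{
    return a + b + c + d + e + f;
}
#endif
```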
In this case the modify list is redundant and need not be specified.
Of course, saying that your function will use/modify more registers means that the compiler has to work around it before and after calls to your assembly function, which may result in less optimal code being generated. There's always a trade-off!
None of the above #pragmas remove the need for the standard prologue and epilogue code that you've seen a thousand times before:
push ebp
mov ebp, esp
; ...
pop ebp
The only exception is if your assembly function isn't using the stack at all.
There are many details I've left out. For example, passing double values will mean two registers get used for one argument because doubles are 8 bytes. But if you only have one register left (maybe you passed 3 ints first), then the double value will get passed on the stack instead. Additionally, there are more details to know when passing/returning structs. But I'm not doing any of this right now, so I've not really looked into it beyond a passing glance.