The Theory of Binary Compatibility
In a software system made up of a number of independently evolving parts (many of which are still evolving, yet will be built to work with last year’s version of that system) it is of vital importance to be able to predict and control the effect of changes made to one component on untold numbers of unknown components that may depend on it.

When a change can be shown not to break any law-abiding software that worked with the component before the change, then that change is backward compatible. When a change is such that all software that functions correctly within the new system will also work with the old system, then that change is said to be forward compatible.

Backward compatibility is generally the overriding requirement, with forward compatibility an added bonus. (Bug fixes, for instance, are by their very nature not forward compatible.)

Compatibility comes in three kinds, in descending order of difficulty. Binary compatibility is to do with the effects of putting separately built yet interdependent parts together; ie, compatibility across link unit boundaries. When we consider the effects of linking independently compiled pieces of code together we’re talking about link compatibility. And finally, a change is source compatible when the source code of dependent components works unchanged when compiled with the new component.

All such considerations are aspects of configuration management, the discipline concerned with assembling working parts into working systems.
When shipping, one is engaging in a particularly awkward form of configuration management: one which is concerned with binary compatibility and can’t even physically put the resulting system together for testing. That is done by customers in the field.

We have to do our configuration management in the abstract. We need to be confident about our work’s compatibility with whole classes of independently created software products. To gain this confidence we need to make reasonable assumptions about the behaviour of such software, and derive rules as to the kind of change it will withstand.
The principal working assumption we’ll make about software produced to work with our components is that it adheres to reasonable software engineering practice; ie, rogue code fraudulently gaining access to all sorts of private bits we’ll break with joyful abandon. (Unless it happens to be that best-selling piece of software, of course.) On the other hand, if ever some facility or piece of information has been (or has given the impression of being) a legitimate part of a supported interface, we are honour-bound to maintain it forever.

Additional assumptions we’ll be making are to do with the behaviour of the C++ implementation we’re working with. We’ll assume certain things that aren’t in the C++ spec, but are nonetheless fairly reasonable. (This effectively limits implementations of Symbian OS to C++ compilers for which those assumptions are true.)

For instance, whereas C++ doesn’t define the layout of class members across access specifiers, we will assume that access specifiers have no effect on layout, and that we can therefore relax a data member’s access restrictions as long as we don’t change declaration order. Similarly, we’ll assume that the order of declaration of virtual member functions is the only thing that affects their order in the virtual table. We’ll assume pointers and references share a representation and, with respect to multiple inheritance, that the representation of a pointer to a class remains unchanged when it is converted to a pointer to the first base class in declaration order. (The first base class comes first in the class’s layout.)

Furthermore, we’ll assume that C++’s type-safe linkage of compilation units is not in force across link units; ie, we’ll be able to make well-considered changes to the name and signature of link unit entrypoints. (This requires link by ordinal everywhere.)
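
The first-base-class assumption can be checked on a given compiler with a few lines of code. The following is a minimal sketch, not a guarantee from the C++ standard: the classes TFirst, TSecond and TDerived are hypothetical, and the result reflects what mainstream compilers do rather than what the language requires.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classes, used only to illustrate the layout assumption.
struct TFirst  { int iA; };
struct TSecond { int iB; };
struct TDerived : TFirst, TSecond { int iC; };

// True when converting a TDerived* to its first base leaves the address unchanged.
inline bool FirstBaseSharesAddress(TDerived& aObj)
{
    TFirst* first = &aObj;  // implicit derived-to-base conversion
    return reinterpret_cast<std::uintptr_t>(first) ==
           reinterpret_cast<std::uintptr_t>(&aObj);
}

// True when converting to the second base adjusts the address.
inline bool SecondBaseIsOffset(TDerived& aObj)
{
    TSecond* second = &aObj;  // the compiler adds the offset of TSecond here
    return reinterpret_cast<std::uintptr_t>(second) !=
           reinterpret_cast<std::uintptr_t>(&aObj);
}
```

On a compiler that satisfies the assumption, both functions return true for any TDerived object, which is why a reference to a class can later be swapped for a reference to its first base without disturbing callers.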
An interface is a contract a provider of services enters into with a client. Either party can “own” the interface. I’ll use the term client interface if the interface is best thought of as belonging to the service provider. If the interface is defined by the client of the service I’ll call it a provider interface. Client interfaces are normally monomorphic whereas provider interfaces can be, and generally are, polymorphic.

In C++ terms, the public interface of a class is an example of a client interface, and so is the protected interface (which defines services the class provides to its subclasses). A class’s virtual interface is a provider interface: the base class specifies services to be provided by derived classes.

In terms of link units (DLLs), a client interface setup exists when a DLL is being used as a shared library. The DLL’s interface is a table of exported functions, knowledge of the indices and meaning of which is shared between the DLL and its clients. A shared library comes complete with an import library, which encapsulates this shared knowledge. The preservation of this information across versions is a necessary condition for compatibility. To this end we maintain a definitions file (.def), which is the source equivalent of the import library. Freeze files (.frz) are just DEFs by another name. Most Symbian OS DLLs serve as shared libraries.

DLLs also work well in a provider interface setup, in both monomorphic and polymorphic flavours.

Monomorphic provider DLLs are useful for choosing a particular service provider at configuration time. Symbian OS examples are the console library (ECONS.dll) and the various libraries adapting the O/S to hardware variants. The client links to an import library generated from a well-defined definitions file, which is published for use by providers. Providers link their implementation DLLs using the interface DEF file.

Polymorphic provider DLLs (drivers, in a broad sense) allow the client to choose a service provider at runtime. Providers once again link using a DEF file published by the interface owner. However, the interface owner doesn’t use an import library this time, but dynamically loads the chosen provider DLL instead. It then uses its knowledge of the exported entrypoints to invoke the provider’s services. In this case, the DEF file is to be treated as true source code since the DLL’s functions are looked up programmatically.
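
To make the idea of programmatic lookup concrete, here is a minimal sketch that simulates an export table in plain C++. It does not use any real loader API: TExportTable, ProviderExports, the two entry points and the 1-based ordinals are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a provider DLL's export table.
using TEntryPoint = int (*)(int);

inline int OpenDevice(int aUnit)  { return aUnit + 100; }
inline int CloseDevice(int aUnit) { return aUnit - 100; }

struct TExportTable
{
    const TEntryPoint* iEntries;
    std::size_t iCount;

    // Look up an entry point by ordinal (1-based by assumption);
    // returns nullptr when the ordinal is out of range.
    TEntryPoint Lookup(std::size_t aOrdinal) const
    {
        if (aOrdinal == 0 || aOrdinal > iCount) return nullptr;
        return iEntries[aOrdinal - 1];
    }
};

// The ordinals published by the interface owner. This enum plays the role
// of the DEF file treated as source code.
enum TProviderOrdinal { EOpenOrdinal = 1, ECloseOrdinal = 2 };

// Stands in for "load the chosen provider DLL".
inline const TExportTable& ProviderExports()
{
    static const TEntryPoint entries[] = { &OpenDevice, &CloseDevice };
    static const TExportTable table = { entries, 2 };
    return table;
}
```

The client never sees a function name, only the agreed ordinal, so a provider that reorders its exports silently breaks every client even though everything still links.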
Binary Compatibility in Practice
There are a number of things all component owners need to do in order to gain control over their binary interface. Once entry-level binary compatibility is assured we can talk about the sorts of changes you will be able to make and how to tweak your interfaces in order to maximize the options.
2.0 Exports - DEF Files
To maintain BC from one release to the next (that is, whilst making only implementation changes), every DLL interface involved needs to have its definitions file preserved for all builds; ie, Debug, Release, Unicode Debug, Unicode Release. At ACME this means archiving the definitions files using the version control system. The definitions file lists the export definitions; ie, a description of those functions that are exported from your DLL.
For both MARM and WINS builds you can specify the frozen (master) DEF files to build with. The build process will then generate new DEFs, identical to the specified frozen ones (unless you’ve added entry points; see “Adding Services” later), and this new DEF will be used to link the DLL. You can tell MAKMAKE to build using the frozen export files by adding the following to your .MMP project files:
DEFFILE component.DEF
MAKMAKE will look for these files in the BMARM and BWINS directories by default. This will ensure that the exports from a subsequent command-line build of the component are compatible with the current version.

(The build process automatically mangles the DEF file name so the correct one is used; eg, componentD.def for a Narrow Debug build, componentUD.def for a Unicode Debug build.)

(Note that some components do not yet conform to this standard. For MARM builds, some components rename the DEFs to be FRZs, that is, freeze files, but these components are, over time, being converted to the new build system as described here.)
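
The effect of building against a frozen DEF file can be sketched as a simple merge: exports already in the frozen file keep their ordinals, and anything new is appended. This is an illustration of the idea only, not the build tool’s actual code. MergeExports is a hypothetical name, and the sketch simply keeps a frozen entry that is missing from the build, where a real tool would presumably report an error.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Position in each vector stands for the ordinal (index 0 = ordinal 1).
// aFrozen: export names from the archived DEF file, in ordinal order.
// aBuilt:  export names found in the freshly built DLL, in any order.
inline std::vector<std::string> MergeExports(
    const std::vector<std::string>& aFrozen,
    const std::vector<std::string>& aBuilt)
{
    std::vector<std::string> result = aFrozen;  // frozen ordinals never move
    for (const std::string& name : aBuilt)
    {
        if (std::find(result.begin(), result.end(), name) == result.end())
            result.push_back(name);  // new export: gets the next free ordinal
    }
    return result;
}
```

For example, merging a frozen list of Open, Close with a build that exports Close, Flush, Open yields Open, Close, Flush: the two existing entry points stay at ordinals 1 and 2, and Flush becomes ordinal 3.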
2.1 MARM Exports file
So where do the .DEF files come from in the first place? For MARM, the build process will always generate a new DEF exports file, which it leaves lying around in the intermediate build directory. This is a link-by-ordinal DEF file, so all you have to do is create a BMARM project directory and copy the DEF files there, adding the correct suffix for the variant. Eg, copy…
\epoc32\build\comp\marmd\rel\dll.def –> \project\bmarm\dll.def
\epoc32\build\comp\marmd\deb\dll.def –> \project\bmarm\dllD.def
\epoc32\build\comp\marmd\urel\dll.def –> \project\bmarm\dllU.def
\epoc32\build\comp\marmd\udeb\dll.def –> \project\bmarm\dllUD.def
2.2 WINS Exports file
With WINS, things are a little more complicated. The basic idea is the same and involves
archiving the DEF exports file. First you use a tool called DEFMAKE to generate the initial
DEF files (into a BWINS project directory), as follows…
defmake \epoc32\release\wins\rel\dll.dll \project\bwins\dll.def
defmake \epoc32\release\wins\deb\dll.dll \project\bwins\dllD.def
defmake \epoc32\release\wins\urel\dll.dll \project\bwins\dllU.def
defmake \epoc32\release\wins\udeb\dll.dll \project\bwins\dllUD.def
DEFMAKE reads the Win32 PE file (the WINS binary you have built), extracts names and ordinals, and produces a link-by-ordinal DEF file, which from now on you will use to link the DLL. Once you have rebuilt your DLL using this new link-by-ordinal DEF file, the resulting PE file will no longer contain names for DEFMAKE to extract. Luckily you will be archiving the exports file (won’t you?) and you won’t normally need to regenerate it, except when you’re adding services (see “Adding Services” later).
3.0 Adding Services
In general, it is possible to add exported services to your published interface. There are restrictions on the type of services that may be added; see “Allowed Changes” below for a description of these.

For both MARM and WINS builds, this is pretty straightforward. All new services get added at the end of the automatically generated DEF file, following a build of the component. Simply replacing the original DEF file with this new one will give these new entry points permanent status. (The reason a DEF file is specified in the .MMP file is that it is used as a template when the build process generates the new one. All matching exports maintain the same ordinal, and all new exports are appended to the new DEF file.)
4.0 Allowed Changes
Now that we have the mechanism in place, let’s look at some of the changes you can or can’t make to an interface while preserving backward binary compatibility. Naturally, these remarks only apply to constructs which are part of an external interface. You are free to arrange your internal interfaces as you see fit.

You can add services to a shared library. Adding classes, global functions, static member functions and non-virtual member functions is fairly straightforward. Remember to avoid name collisions (as always) and to freeze the new entry points as soon as the new version of the library is released.

You can’t generally add or delete virtual member functions, or even change their order of declaration. You can’t even override an existing virtual function that was previously inherited. (Existing derived classes would be left inheriting the wrong function.)
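
Why reordering or inserting virtual functions is fatal can be shown with a hand-rolled table of function pointers. This is a simulation under the assumption stated earlier (virtual table order follows declaration order), not a dump of any compiler’s real virtual table, and all names are hypothetical.

```cpp
#include <cassert>

// Hand-rolled "virtual table": function pointers in declaration order.
using TSlot = int (*)();

inline int Draw()  { return 1; }
inline int Size()  { return 2; }
inline int Focus() { return 3; }

// Version 1 of the class declared Draw(), then Size().
static const TSlot KTableV1[] = { &Draw, &Size };
// Version 2 inserts Focus() between them.
static const TSlot KTableV2[] = { &Draw, &Focus, &Size };

// A client compiled against version 1 has "Size() is slot 1" baked in.
const int KSizeSlotAsSeenByOldClient = 1;

inline int OldClientCallsSize(const TSlot* aTable)
{
    return aTable[KSizeSlotAsSeenByOldClient]();
}
```

Against KTableV1 the old client reaches Size(); against KTableV2 the same compiled call lands in Focus(). Appending at the end does not rescue the situation in general either, because derived classes in other binaries have already laid out their own additional virtual functions after the base class’s slots.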
All changes to private non-virtual (static and non-static) member functions are OK as long as they are not accessed through public or protected inline functions, either directly or indirectly. Some friends or member functions affected by the change may be in a different link unit, in which case you must make sure that the relevant binaries are kept in synch at all times. If this is not practical then the change must be disallowed.
You can make changes to private data members that are not accessed through any public or protected inline functions (directly or indirectly), provided that the size of the class remains unchanged and that the offsets of all public or protected data members, and of private members accessible through public or protected inline functions (directly or indirectly), stay the same. If friends or members of the class exist in a different link unit then all relevant binaries must be kept in synch.

You can relax access specifiers; ie, a protected member can become public, and a private member public or protected. The reverse is not allowed because it would make it impossible to draw any conclusions from a member’s current access specification. An exception to this rule can be made if a forwarding inline function is left in its place.

Similarly, you can bestow friendship upon additional classes or functions but, once given, it can’t be withdrawn. Friendship is forever.
You can change the size of a class provided it has only private, non-inline constructors and either has a virtual destructor or, if it has a non-virtual destructor, doesn’t declare or inherit an operator delete() of the two-argument form. In this case only friends and members can allocate memory for and construct instances of the class. The operator delete() requirement in the presence of a non-virtual destructor exists because in that case the compiler will supply the second argument (the size of the object) to the delete operator based on the version of the class declaration it has seen. Further restrictions are as for changes to private data members.

Note that a class without constructors gets a compiler-generated public default constructor, and a class without a copy constructor gets a compiler-generated public copy constructor. (Constructor generation, however, can be inhibited higher up in the class hierarchy, as is the case for copy constructors in classes derived from CBase.)
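
A class shaped to allow later size changes might look like the sketch below. CSession and its members are hypothetical; the point is only the combination of private non-inline constructors, an inhibited copy constructor, a virtual destructor and a factory function, so that every allocation happens inside the library.

```cpp
#include <cassert>

// Hypothetical class: names and members are illustrative only.
class CSession
{
public:
    static CSession* New(int aId);  // factory; exported and non-inline in a real library
    virtual ~CSession();            // virtual, so deletion is routed to library code
    int Id() const;                 // non-inline accessor: no offsets leak to clients
private:
    explicit CSession(int aId);             // private and non-inline
    CSession(const CSession&);              // copying inhibited: declared, never defined
    CSession& operator=(const CSession&);   // likewise
private:
    int iId;
    // Further private data members could be added here in a later version.
};

// The definitions below would live inside the library, out of clients' sight.
CSession::CSession(int aId) : iId(aId) {}
CSession::~CSession() {}
CSession* CSession::New(int aId) { return new CSession(aId); }
int CSession::Id() const { return iId; }
```

Clients can only write `CSession* s = CSession::New(7);` and later `delete s;`. Neither statement depends on sizeof(CSession) as seen by the client’s compiler, which is what leaves the library free to grow the class.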
You can widen the range of valid inputs to a function, or narrow the range of possible outputs.
You can’t change the interpretation of an existing valid input or change the meaning of an
existing output value. An enumeration can be added to but not reordered, say.
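
The enumeration case can be sketched as follows. TAlign and StartColumn are hypothetical; the example assumes a version 1 that had only the first three enumerators, and shows a version 2 that widens the range of valid inputs without changing what the old inputs mean.

```cpp
#include <cassert>

// Version 1 (assumed):  enum TAlign { ELeft, ECentre, ERight };
// Version 2 appends a value; the existing values keep their numbers.
enum TAlign { ELeft, ECentre, ERight, EJustify };

// Old clients pass 0..2, and those must still mean the same thing.
static_assert(ELeft == 0 && ECentre == 1 && ERight == 2,
              "existing enumerators must keep their values");

// Returns the starting column for a line with aSlack spare columns.
inline int StartColumn(TAlign aAlign, int aSlack)
{
    switch (aAlign)
    {
    case ELeft:    return 0;
    case ECentre:  return aSlack / 2;
    case ERight:   return aSlack;
    case EJustify: return 0;  // newly valid input, added in version 2
    }
    return 0;
}
```

Inserting EJustify anywhere other than the end would renumber the later enumerators, and binaries built against version 1 would then silently ask for the wrong alignment.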
You can change the name and/or signature of a function if the change preserves the input or output ranges, or changes them along acceptable lines. A non-const parameter can be made const, a reference to a class can be changed to a reference to its first base class, etc.

As a single, unlikely exception, you can add a virtual function to a class, or implement a previously inherited virtual function that wasn’t public, if classes derived from the previous incarnation of the class couldn’t have been instantiated; ie, if the class had only private constructors or had a private destructor. (Yeah, right.)

I guess there’s a general flavour to these rules, which can be used to derive some guidelines for defensive interface definition. It goes something like this. A change is OK if either:
(a) You can pin down and fix every single line of code affected by it, and make sure that the fixed code goes everywhere the change goes. This only works if no aspect of the change escapes from the private domain.
(b) The change is demonstrably compatible with all possible clients, not just current ones or ones you are aware of.
Unless you are confident that an interface will never need changing (this is a valid attitude, especially if you turn out to be right), you should be defensive about what leaks out of your interfaces. No information should escape for no particular reason.

Quite often information seems to make it out into the public domain by “accident”. Panic numbers, message numbers, purely private definitions such as hard-coded directory names, and indeed entire private headers are but a few examples. Scores of private libraries have their import libraries released. Avoid doing this if at all possible. What you do not publish you do not have to freeze.

Perhaps the most common violation of this principle is overuse of the protected keyword. Protected is often just slapped on by default, on the grounds that it allows more flexibility. The reverse is true. Unnecessary protected interfaces have to be supported in perpetuity just as legitimate ones have to be. Protected belongs only in classes designed as base classes in a framework.
Perhaps another thing that is apparent from the above is that the options with virtual functions are severely limited. In frameworks with high fluidity it may therefore be appropriate to add in one or two “spare” virtual functions. A restricted class of changes may be supported by pressing such a spare into service; for instance, if a framework suddenly, courtesy of a new category of concrete classes, acquires new attributes along a new “dimension”. As a contrived example, should controls all of a sudden require a degree of transparency, so that they can be layered with lower layers filtering through, then a spare virtual function could be given a default behaviour that suits existing (ie, opaque) controls, and new, transparent controls can be introduced. This is certainly no panacea but may be useful in some cases.
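
The transparency example might be sketched like this. None of these class or function names come from Symbian OS, and the original spare is assumed to have been declared as a do-nothing virtual with the same signature; the sketch shows only the state of the framework after the spare has been pressed into service.

```cpp
#include <cassert>

// Hypothetical framework base class.
class CControlBase
{
public:
    virtual ~CControlBase() {}
    virtual int Draw() const { return 0; }
    // Assumed to have shipped originally as a spare:
    //     virtual int Reserved_1() const { return 0; }
    // Same slot, same signature, now given a meaning. The default
    // describes existing (opaque) controls.
    virtual int TransparencyPercent() const { return 0; }
};

// Written against the original framework: knows nothing about transparency.
class COldButton : public CControlBase {};

// Written against the new framework.
class CGlassPanel : public CControlBase
{
public:
    virtual int TransparencyPercent() const { return 60; }
};

// Framework code can now ask any control, old or new.
inline int TransparencyOf(const CControlBase& aControl)
{
    return aControl.TransparencyPercent();
}
```

Because the number and order of virtual functions never changed, old controls keep working and simply report themselves as opaque. The approach only stretches as far as the spare’s original signature allows, which is part of why it is no panacea.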