Friday, February 8, 2008

The truth about Microsoft Office compatibility

Stéphane Rodriguez, February 2008

 



The Microsoft Office product team's rationale for designing a file format as poorly engineered as OOXML is, according to their own public statements, not to advance the state of the art in Office document models, but to carry two decades' worth of legacy features, buggy if not broken, into the future.



They have a commercial reason to do so. According to their numbers, there is a 400,000,000 user install base. But with the introduction of the new file format OOXML and the accompanying application that they rushed out the door in order to ship in line with Windows Vista, is it true that Office 2007 is Microsoft's best thing since sliced bread when it comes to compatibility?



1) There is no Office 2007 64-bit edition

2) No support for 64-bit addins, user defined functions, existing ActiveX controls, OLE servers or even managed code (.NET).

3) No support for VBA in Excel services, part of their server suite

4) No support for VBA in Office 2008, the Mac version

5) Reduced speed of Office 2007 on 64-bit operating systems

6) Limited memory space of Office 2007 on 64-bit operating systems

7) Reduced legacy code compatibility of Office 2007 on 64-bit operating systems

8) OOXML supports 64-bit at all?

9) An open standard excluding competitors

10) Reverse engineering a necessity for compatibility reasons

11) No compliant file format




 


1) There is no Office 2007 64-bit edition



As amazing as it sounds, Microsoft didn't bother to ship a 64-bit version of Office 2007, a version that would run natively on one of the 64-bit editions of Windows that they have made available (XP 64 edition, 2003 x64 edition, 2003 Itanium edition, 2008 server).



64-bit computers have become mainstream, and demand is growing very fast.



Microsoft says so on their website (Office system requirements):


System requirements overview



The 2007 Microsoft Office system programs client is a 32-bit application and can run on a Windows 64-bit platform (Windows XP, Windows Server 2003, and Windows Vista) but there may be some feature limitations as noted in the system requirements below. (...)


The company that ships both Office and Windows couldn't find a good reason to provide native support to their customers using 64-bit operating systems. For Office to run at all, customers have to run it in emulated mode, otherwise known as WOW64.



Despite the compatibility claims, a number of limitations make it impossible to use this product with any confidence that existing applications and add-ins will work as intended. To add insult to injury, a number of performance issues arise as well, making it hard to talk about compatibility without sounding like a liar. If you are in the process of purchasing Microsoft software, you may want to hold back a bit and perhaps consider an alternative.





The Microsoft Office compatibility matrix (true for all Office versions)


 

We are now going to go into the details.



 


2) No support for 64-bit addins, user defined functions, existing ActiveX controls, OLE servers or even managed code (.NET).



Microsoft has not touched VBA, the scripting language that so many business users are fond of, for a number of years. In fact, it has been 7 years since they chose to outsource the maintenance of VBA (i.e. no more development) to Summit Software, a consulting company. VBA is at the heart of the so-called Office platform. It's what appeals to customers in need of custom applications built around Office. The fact that VBA is no longer being developed should at least give anyone the assurance that whenever Microsoft ships a revision of Office, all applications based on VBA automatically work. Well, there is one problem: there isn't a 64-bit version of VBA.



If you are using a VBA-based third-party solution that depends on anything native to the 64-bit operating system, and try to load it as part of Office, it simply won't work.



In retrospect, one has to ask: is the lack of a 64-bit VBA the reason why Microsoft does not ship a 64-bit edition of Office? This question is worth asking. If the answer is yes, it means Microsoft is lagging behind its own technology, and is unable to meet the needs of anyone who runs a 64-bit operating system in order to take advantage of the 64-bit improvements over 32-bit.



What applies to VBA applies to all user defined functions and even managed code (.NET) as well. No 64-bit user defined function or 64-bit managed code assembly can be loaded as part of a 32-bit image (Winword.exe, Excel.exe, Powerpoint.exe). As a result, neither your own solutions nor third-party solutions can take advantage of Office and of the 64-bit operating system at the same time.
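To make the architecture mismatch concrete, here is a minimal sketch in Python of how one might read the target architecture recorded in the header of a native Windows image (DLL or EXE) and compare it with that of the host process image. The helper names (`pe_machine`, `can_load_in_process`, `fake_image`) are made up for this illustration, the stub images are synthetic rather than real files, and the check only covers native images: it says nothing about managed assemblies compiled for "any CPU".

```python
import struct

# Machine-type codes stored in the PE/COFF file header.
MACHINE_NAMES = {
    0x014C: "x86 (32-bit)",
    0x8664: "x64 (64-bit)",
    0x0200: "Itanium (64-bit)",
}


def pe_machine(data: bytes) -> str:
    """Return the target architecture recorded in a PE image's header."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE image: missing MZ signature")
    # Offset 0x3C of the DOS header holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image: missing PE signature")
    # The 2-byte Machine field immediately follows the 4-byte signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, f"unknown (0x{machine:04X})")


def can_load_in_process(dll: bytes, host: bytes) -> bool:
    """A native DLL can only be loaded by a process of the same architecture."""
    return pe_machine(dll) == pe_machine(host)


def fake_image(machine: int) -> bytes:
    """Build a minimal header stub for demonstration (not a loadable file)."""
    dos_header = bytearray(0x40)
    dos_header[:2] = b"MZ"
    struct.pack_into("<I", dos_header, 0x3C, 0x40)  # PE signature follows at 0x40
    return bytes(dos_header) + b"PE\0\0" + struct.pack("<H", machine)


if __name__ == "__main__":
    excel_32 = fake_image(0x014C)   # stands in for a 32-bit Excel.exe
    addin_64 = fake_image(0x8664)   # stands in for a 64-bit add-in DLL
    print(pe_machine(excel_32), "/", pe_machine(addin_64))
    print("loadable:", can_load_in_process(addin_64, excel_32))
```

On a real system the same function could be fed the first few hundred bytes of an actual add-in file to find out whether it targets the same architecture as the Office executable.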



 


3) No support for VBA in Excel services, part of their server suite



Microsoft ships Excel in two editions, a desktop edition (Excel 2007) and a server edition (Sharepoint server 2007 which includes a server, Excel services). Even though Microsoft ships a 64-bit edition of the server suite, you still cannot use VBA as part of Excel services.



Why is that so? Simply because VBA was designed to run in a COM single-threaded apartment (STA), Microsoft newspeak for single-threaded. Excel services is designed to distribute calculations and serve users across threads (a fundamentally multi-threaded architecture) and across machines, so a single-threaded component cannot be part of this architecture.
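The constraint can be pictured with a rough analogy in Python. This is not COM and not how Excel services is implemented; `SingleThreadedComponent` and `run_demo` are hypothetical names, and the one-worker executor merely stands in for the single thread that owns a single-threaded component. Whatever the number of callers, every call ends up queued on that one thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SingleThreadedComponent:
    """Stand-in for an STA-style component: every call runs on one owner thread."""

    def __init__(self):
        # A single worker thread plays the role of the apartment's only thread.
        self._apartment = ThreadPoolExecutor(max_workers=1)

    def call(self, func, *args):
        # Callers on any thread are queued behind each other and block for the result.
        return self._apartment.submit(func, *args).result()

    def shutdown(self):
        self._apartment.shutdown()


def run_demo(n_callers: int = 4):
    """Invoke the component from several threads; report results and threads used."""
    component = SingleThreadedComponent()
    seen_threads = []
    results = [None] * n_callers

    def macro(x):
        # Record which thread actually executes the "macro".
        seen_threads.append(threading.get_ident())
        return x * 2

    def caller(i):
        results[i] = component.call(macro, i)

    threads = [threading.Thread(target=caller, args=(i,)) for i in range(n_callers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    component.shutdown()
    return results, set(seen_threads)


if __name__ == "__main__":
    results, executing_threads = run_demo()
    print(results, "executed on", len(executing_threads), "thread(s)")
```

Adding more caller threads does not add any parallelism here, which is the reason a component of this kind fits poorly in a server meant to spread work across threads.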



The consequence is that, should you consider using Excel services instead of the desktop edition of Excel to meet your needs, you automatically lose the ability to reliably open any spreadsheet that embeds your own or third-party solutions using VBA. For instance, the finance-related add-ins that ship by default with Excel 2007 won't run.



 


4) No support for VBA in Office 2008, the Mac version



When Microsoft shipped the Mac version of Office 2007 in January this year, known as Mac Office 2008, it had no qualms about taking away the support for VBA, which breaks any VBA-based solution on the Mac.



In the top 5 issues of the Microsoft website related to Mac Office 2008, it says :

My Visual Basic macros don't work



Cause: Office 2008 for Mac cannot run Visual Basic macros or load add-ins that contain Visual Basic macros.





Solution: Keep the macro in the file.

Solution: Remove the macro from the file.

Solution: Save the macro in another macro-enabled file format.

Solution: Create a new macro by using AppleScript.



Apparently, it does not matter so much that VBA macros embedded in documents were easy to use and met the needs of business users. Remember, VBA macros are a clever way for users to customize solutions without having IT people interfering. A remarkable feature is that, because VBA macros are embedded, the deployment of solutions is trivial and does not involve the hassles of the IT department. This is in sharp contrast with .NET, Microsoft's proposal for the future, which uses external files (assemblies) and additional rules for running: impossible to deploy without IT people.



 


5) Reduced speed of Office 2007 on 64-bit operating systems



When an application runs in WOW64, the emulation mode, its calls into the 64-bit operating system are marshalled back and forth to adjust pointer sizes, which creates a performance hit. This is especially critical for Word pagination, Excel calculations and Powerpoint animations.
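The pointer-size adjustment can be pictured with a small Python sketch. The request layout used here (a buffer pointer followed by a length) is hypothetical and chosen only for illustration; it is not an actual Windows structure, and `widen` and `narrow` are made-up names for the two directions of the conversion.

```python
import struct

# Hypothetical request layout: a buffer pointer followed by a length.
REQ32 = struct.Struct("<II")  # 32-bit pointer, 32-bit length  -> 8 bytes
REQ64 = struct.Struct("<QQ")  # 64-bit pointer, 64-bit length  -> 16 bytes


def widen(request32: bytes) -> bytes:
    """Convert a 32-bit request into the layout a 64-bit callee expects."""
    pointer, length = REQ32.unpack(request32)
    return REQ64.pack(pointer, length)


def narrow(request64: bytes) -> bytes:
    """Convert a 64-bit result back; fails if it no longer fits in 32 bits."""
    pointer, length = REQ64.unpack(request64)
    if pointer >= 2 ** 32 or length >= 2 ** 32:
        raise OverflowError("value does not fit in a 32-bit field")
    return REQ32.pack(pointer, length)


if __name__ == "__main__":
    original = REQ32.pack(0x7FFE0000, 4096)
    widened = widen(original)
    print(len(original), "bytes ->", len(widened), "bytes")
    print("round trip intact:", narrow(widened) == original)
```

Each crossing between the 32-bit application and the 64-bit system implies a conversion of this kind in both directions, which is extra work a native application would not have to do.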



Compatibility is therefore somewhat subjective. If your spreadsheet used to calculate within 10 minutes and now takes twice as long, it is clear that in companies where speed is crucial, this is regarded as a bug that must be fixed.



One of the advantages of the 64-bit CPU is that an application that is natively 64-bit can make full use of the larger register set. But when the application is running in WOW64, it has to restrict itself to the usual 32-bit CPU registers, and is therefore unable to take advantage of the speed increase that a larger register set provides (as a result of less use of the stack).



 


6) Limited memory space of Office 2007 on 64-bit operating systems



Another advantage of a 64-bit operating system and CPU is the ability to run processes using a 64-bit wide address space. We are talking about a practically unlimited memory space, so to speak. But in 32-bit, and therefore in the WOW64 emulation mode, none of that is possible: a running process is restricted to a 4GB address space, of which by default only 2GB is available to the application.
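The orders of magnitude involved are easy to check. The following Python lines compute the theoretical size of the address space for a given pointer width; the even split of the 32-bit space is the default figure quoted above, and real 64-bit systems expose far less than the theoretical 64-bit figure.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def address_space_bytes(pointer_bits: int) -> int:
    """Number of distinct byte addresses reachable with pointers of this width."""
    return 2 ** pointer_bits


total_32 = address_space_bytes(32)
usable_32_default = total_32 // 2   # default split: half of the 32-bit space
total_64 = address_space_bytes(64)  # theoretical upper bound only

if __name__ == "__main__":
    print("32-bit address space:", total_32 // GIB, "GB")
    print("usable by default   :", usable_32_default // GIB, "GB")
    print("64-bit / 32-bit     :", total_64 // total_32, "times larger")
```

The ratio between the two theoretical figures is itself 2 to the power of 32, which is why the 64-bit space feels unlimited next to the 32-bit one.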



Anyone expecting to work with larger spreadsheets or documents is stopped in the starting blocks. Purchasing such a system is therefore not economically sound.



 


7) Reduced legacy code compatibility of Office 2007 on 64-bit operating systems



As per Microsoft's own website:



Q. Are there any features in the 32-bit versions of Windows that are not in Windows Server 2003 x64 editions?


A. A small number of features are not included in x64 Windows, including DOS, POSIX, 16-bit support, and a few legacy networking protocols no longer in active use. However, we do not expect most customers to be affected by these differences. Based on customer feedback, we expect the initial x64 usage scenarios for Windows Server 2003 to be databases, business applications, Terminal Server, Active Directory, Internet Information Services (IIS), and technical computing.


If you are using old Excel 4.0/5.0 modules or user-defined functions built on 16-bit code that work on a 32-bit operating system, you had better not expect them to work with Microsoft Office on a 64-bit operating system. The reason is that the WOW64 emulation mode does not marshal 16-bit pointer sizes. No compatibility, period.



In the real world, many businesses are using old 16-bit solutions on a daily basis. There is no migration path for them on 64-bit operating systems.



 


8) OOXML supports 64-bit at all?



If anything, the previous sections have illustrated that "Microsoft Office" and "64-bit operating systems" are strangers. As such, how does one make a relevant compatibility case of OOXML consumed or produced by a native non-Microsoft 64-bit application? How do we know that today's OOXML makes sense at all across platforms?



 


9) An open standard excluding competitors



The ECMA 376 proposal tries to make a case for compatibility with legacy formats, such as binary formats. Forgetting for a moment that the new XML formats still contain a number of binary blobs and are therefore in direct contradiction with the proposal, the stated goal is:



ECMA 376, Part 1, Section 2.1: The goal of this clause is to define conformance, and to provide interoperability guidelines in a way that fosters broad and innovative use of the Office Open XML file format, while maximizing interoperability and preserving investment in existing files and applications.


It's pretty clear that the justification for the proposal is the preservation of the investment in files and applications. The preservation of the investment in applications has been addressed in the sections above, where we have seen that the reality is a little different: Microsoft can't seem to interoperate across their own platforms. With regard to binary files, it is just the same, as we are going to see.



Microsoft Office 2007 keeps secret how it migrates a binary file into a new file. The use case is neither described, documented, nor illustrated in ECMA 376, even though that is the stated goal of... ECMA 376. Competitors are excluded from doing the same reliably. Is that what we call exclusive openness?



We are in a situation where the stated goal and the content of the body of the ECMA 376 proposal don't match. This point was raised by a number of national bodies. The only national body whose discussions were made public (Microsoft originally vehemently opposed their publication) was the US INCITS review group. This critical question remained unanswered, and it is known and widely reported that the US INCITS ballot abruptly changed from a negative vote (overwhelmingly, first ballot in July 2007) to a positive vote (almost unanimously, in August 2007) after Bill Gates made a phone call to the US secretary.



To review the US INCITS discussions, head over to :


US INCITS V1 archive, April
US INCITS V1 archive, May
US INCITS V1 archive, June
US INCITS V1 archive, July
US INCITS V1 archive, August
US INCITS V1 archive, September
US INCITS V1 archive, October



 


10) Reverse engineering a necessity for compatibility reasons



Microsoft made clear publicly that the design of the XML vocabulary was not done at the expense of performance (performance is in the DNA of their engineering techniques, or so they say), and that they worked towards minimizing the performance hits due to the XML parsing, whose footprint is actually hidden thanks to the ZIP compression, a technique introduced by Open Office, the open source Office suite.



While this induced considerations such as keeping XML names as small as possible, and avoiding redundancies through redirections, both at the expense of implementors, what Microsoft won't say is that they went as far as making an extensive use of default values, that they consequently do not even store in the file.



It is a problem. You cannot find information when it is not in the file! Microsoft has thousands of default values that they use in a running instance of Office 2007, values that can be overridden by explicit attribute values, with the implication that non-Microsoft applications must reproduce the same defaults in order to be compatible. All this is implementation-specific, at the expense of the file format: this is where document design interferes with application design. Performance is in their DNA, but two decades of closed product development have consequences. Obviously Microsoft never designed Office 2007 with the idea that they would submit their internal specifications to an international organization.
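A tiny Python sketch shows why unstored defaults matter to a third-party implementor. The element name, attribute names and default values below are invented for the illustration and are not taken from ECMA 376; `effective_attributes` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults a consumer would have to know in advance.
# These names and values are invented; they are NOT taken from ECMA 376.
ASSUMED_DEFAULTS = {"bold": "0", "size": "11"}


def effective_attributes(element, defaults):
    """Merge the attributes present in the file over a table of defaults."""
    merged = dict(defaults)
    merged.update(element.attrib)  # explicit values override the defaults
    return merged


# The file only stores what differs from the producer's defaults.
run = ET.fromstring('<run size="14">Hello</run>')

if __name__ == "__main__":
    print("stored in the file:", run.attrib)
    print("with our defaults :", effective_attributes(run, ASSUMED_DEFAULTS))
    # A consumer that guesses a different default renders a different document.
    print("with a wrong guess:", effective_attributes(run, {"bold": "1", "size": "11"}))
```

The file itself says nothing about `bold`, so two consumers holding different default tables will disagree about the same document, and only the producing application knows which table is the right one.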



In other words, short of Microsoft providing the exhaustive list of such default values, including the non-trivial combinations, there is no way a competing application can be reliably compatible without reverse engineering Office 2007. Which defeats the point of the ISO proposal.




 


11) No compliant file format



There is no strict "Save As ECMA 376" in Office 2007. There is no way an agent (application, third-party, SOX) is going to be able to produce such files. As if that were not enough, ECMA 376 itself is a moving target, due to Microsoft's own inability to propose a format that met the bare minimum requirements of international standards.



How does one achieve a file format compliance under those conditions?



 


Conclusion: this article has demonstrated that, from whatever angle you look at the problem, compatibility is more wishful thinking than reality. In fact, experienced personnel will recognize in it much of the evolution of Microsoft Office over the years. The problem is that Microsoft is going to ISO with it. It contradicts even the most trivial assumptions.
Fast-tracking the ECMA 376 proposal made no sense at all. Microsoft is urged to go back to the drawing board.