File Formats and Legacy Object Unarchiving
An Object Lesson for Programmers
©2002 Andrew C. Stone All Rights Reserved
Most programs write data files. This article describes how we used to write objects, the trouble that got us into later, how I got out of that trouble, and how to write a file in a format that will endure almost forever, be forward and backward compatible, and be universally parseable.
A long time ago in a place not too far away, disk space was a very precious commodity - witness my first mass storage purchase in 1986: a 10 Megabyte external SCSI hard drive for my Mac+. The cost was $600. Now you can get a 30 gig drive for around $130: three thousand times the storage for less than a quarter of the cost. Back in those early dark days, compressed binary streams that only a machine could read made sense. And if space is still an issue because of scale, binary streams are still useful. But when it comes to the life cycle of software and maximal interoperability between applications, a plain English table of keys and values that humans can read (such as XML) is a compelling archival format because documents are backward and forward compatible: later versions of your program can open documents written by earlier versions, and earlier versions can open documents produced by later versions, although some information may be lost.
Moreover, you do not force your users into unacceptable proprietary data formats. Instead, Cocoa developers should support writing their objects in XML as a combination of dictionaries, arrays, strings, values and data. Data is the catch-all for objects which cannot be decomposed into the simpler types or are best represented as an object stream, such as an NSColor or an NSTextStorage. The CoreFoundation classes which serve both Carbon and Cocoa can write standard objects as either UTF-8 XML or as an even easier-to-read format, the old-style NeXTStep property list:
{
BitDepth = 520;
DocumentHTML = {
BGColor = <040b7479 70656473 74726561 6d8103e8 84014084 8484074e 53436f6c 6f720084 84084e53 4f626a65 63740085 84016301 84046666 6666833f 4ccccd83 3f4ccccd 010186>;
CenterInTable = YES;
Coalesce = 0;
ColorFromView = NO;
HTMLClass = DocumentHTML;
Now, compare that readability and compactness to the XML version:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist SYSTEM "file://localhost/System/Library/DTDs/PropertyList.dtd">
<plist version="0.9">
<dict>
<key>BitDepth</key>
<string>520</string>
<key>DocumentHTML</key>
<dict>
<key>BGColor</key>
<data>
BAt0eXBlZHN0cmVhbYED6IQBQISEhAdOU0NvbG9yAISECE5TT2JqZWN0AIWEAWMDhAJm
ZgEBhg==
</data>
<key>CenterInTable</key>
<string>YES</string>
<key>Coalesce</key>
<integer>0</integer>
<key>ColorFromView</key>
<string>NO</string>
<key>HTMLClass</key>
<string>DocumentHTML</string>
XML adds the ability to write dates and number values to the standard set of dictionary, array, string and data. /Developer/Documentation/ReleaseNotes/Foundation.html explains this and other nuances about writing XML. Currently, we prefer the old-style property list for cross-platform compatibility with OpenStep and Mac OS X Server versions of our applications.
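To make the point about dates and numbers concrete, here is a minimal sketch. It assumes a current Foundation, where NSPropertyListSerialization offers dataWithPropertyList:format:options:error: and propertyListWithData:options:format:error: (both postdate this article). The helper names XMLDataForPlist and PlistFromData are invented for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: serialize a property list object as UTF-8 XML.
static NSData *XMLDataForPlist(id plist)
{
    NSError *error = nil;
    NSData *data = [NSPropertyListSerialization dataWithPropertyList:plist
                                                              format:NSPropertyListXMLFormat_v1_0
                                                             options:0
                                                               error:&error];
    if (data == nil) NSLog(@"could not write XML property list: %@", error);
    return data;
}

// Hypothetical helper: parse property list bytes, XML or old-style text.
static id PlistFromData(NSData *data)
{
    NSError *error = nil;
    id plist = [NSPropertyListSerialization propertyListWithData:data
                                                         options:NSPropertyListImmutable
                                                          format:NULL
                                                           error:&error];
    if (plist == nil) NSLog(@"could not read property list: %@", error);
    return plist;
}
```

Passing a dictionary that holds an NSNumber and an NSDate through XMLDataForPlist and back through PlistFromData should return an NSNumber and an NSDate. The same reader also accepts the old-style text, but there a value such as 520 comes back as the string "520", so the application has to convert it itself.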
Opening the Vault
The reason I know that property lists are the way to archive data is the amount of work I have had to do to convert old NeXTStep 3 object streams into something openable by Create® for OS X. For almost all AppKit classes, Apple already provides the ability to open the original NeXTStep counterpart class. For example, an NSColor can open an NXColor struct, and NSPrintInfo can open archived PrintInfo objects. Where this all falls down is in View: NSView cannot open View objects because subviews were stored in a List, and the Objective-C List object does not have an -initWithCoder:. This can be verified by reviewing the Darwin source code. This affected NeXTStep applications such as Create, NeXTStep Draw, and others which stored the data model in a binary typed stream as a list of graphics owned by GraphicView, a subclass of View:
typedstreamÅ¢ÑiÑ@ÑÑÑ PrintInfo
Black_WeaverÑÑNeXT 400 dpi Level II PrinterÑÑgoat
In hindsight, this was a stupid architectural decision - you shouldn't commingle the data that forms the content of your document with the way you display that data. And, if you are an object purist such as this wizened old timer, you are not content until the view, the model and the controller components are all neatly separated. Luckily, if you use Cocoa's NSDocument architecture, you will automatically do the right thing!
The File Upgrader
Create has a separate, included application named "CreateDocUpgrader.app". By adding a Copy Files... phase to your Project Builder project, you can include other builds and products in your main application, such as the CreateDocUpgrader.app. Create registers for the old NeXTStep Create's file extension, ".create" in Project Builder:
1. Click "Targets"
2. Select your application
3. Click on "Application Settings" in the Target inspector pane
4. Click on your document type
5. Add the new extension to be opened in the Extensions field, separated by spaces
When a file is double-clicked, Finder will ask your app to open it. Your NSApplication's delegate class implements the following method to call on the included upgrader application:
- (BOOL)application:(NSApplication *)sender openFile:(NSString *)path
{
NSString *ext = [path pathExtension];
if ([ext isEqual:@"create"]) {
NSString *upgrader = [[NSBundle mainBundle] pathForResource:@"CreateDocUpgrader" ofType:@"app"];
if (upgrader) {
return [[NSWorkspace sharedWorkspace] openFile:path withApplication:upgrader];
} else NSRunAlertPanel(UPGRADER_TITLE,UPGRADER_MSG,OK,NULL,NULL);
} else ...
The Upgrader is launched and asked to open the legacy file. In the Upgrader's NSApplication's delegate, you implement:
- (BOOL)application:(NSApplication *)sender openFile:(NSString *)path
{
return [self openLegacyFileAndAutoSave:path];
}
We want to convert the file and then open it with Create when it completes:
- (BOOL)openLegacyFileAndAutoSave:(NSString *)file
{
// get a new, unique file name based on the original name:
NSString *newName = [self newFileNameWithPath:file];
if ( [[self documentFromFile:file] saveFileToPath:newName] ) {
return [[NSWorkspace sharedWorkspace] openFile:newName withApplication:@"Create"];
}
NSRunAlertPanel(OPEN, CANNOT_OPEN_MSG, OK, NULL,NULL,file,newName);
return NO;
}
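The newFileNameWithPath: helper is not shown in this article. One plausible way to build such a name is sketched below, purely as an assumption: swap the extension and add a numeric suffix until no file of that name exists. The function name and the extension passed in are hypothetical, not Create's actual values.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: replace the legacy extension with newExtension and,
// if a file of that name already exists, append -1, -2, ... until it is free.
static NSString *UniquePathByReplacingExtension(NSString *path, NSString *newExtension)
{
    NSFileManager *manager = [NSFileManager defaultManager];
    NSString *base = [path stringByDeletingPathExtension];
    NSString *candidate = [base stringByAppendingPathExtension:newExtension];
    int suffix = 0;

    while ([manager fileExistsAtPath:candidate]) {
        suffix++;
        candidate = [[NSString stringWithFormat:@"%@-%d", base, suffix]
                        stringByAppendingPathExtension:newExtension];
    }
    return candidate;
}
```

The check and the later save are not atomic, so another process could claim the name in between. For a one-shot upgrader launched by a double-click that is probably acceptable, but it is worth knowing.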
So, the real guts of the upgrader is the ability to read a typed stream by implementing initWithCoder: in all classes contained in the typed stream, and then to have those objects create an English property list representation, that is, the property list format that Create now uses. For the most part, you map the old classes to the new NS* classes by invoking a little rewiring magic in NSUnarchiver. Early in the launch cycle (applicationWillFinishLaunching: for example), include code like this:
[NSUnarchiver decodeClassName:@"PrintInfo" asClassName:@"NSPrintInfo"];
So, when the unarchiver hits the class named "PrintInfo" in the object "typed stream", it will instantiate an NSPrintInfo object, which knows how to open the older versions of this object in its -initWithCoder:(NSUnarchiver *)unarchiver method. For Create documents, we rewired the following:
- (void)applicationWillFinishLaunching:(NSNotification *)notification {
// Some classes just work by remapping:
[NSUnarchiver decodeClassName:@"PrintInfo" asClassName:@"NSPrintInfo"];
// we need to be able to unarchive NSImage's - and they just work as well!
[NSUnarchiver decodeClassName:@"NXImage" asClassName:@"NSImage"];
[NSUnarchiver decodeClassName:@"NXImageRep" asClassName:@"NSImageRep"];
[NSUnarchiver decodeClassName:@"NXBitmapImageRep" asClassName:@"NSBitmapImageRep"];
[NSUnarchiver decodeClassName:@"NXCachedImageRep" asClassName:@"NSCachedImageRep"];
[NSUnarchiver decodeClassName:@"NXEPSImageRep" asClassName:@"NSEPSImageRep"];
// we need our own List reader:
[NSUnarchiver decodeClassName:@"List" asClassName:@"MyList"];
// the classes NOT openable by Darwin/OS X:
// we'll write our own versions:
[NSUnarchiver decodeClassName:@"View" asClassName:@"MyView"];
[NSUnarchiver decodeClassName:@"Window" asClassName:@"MyWindow"];
[NSUnarchiver decodeClassName:@"Responder" asClassName:@"MyResponder"];
// Some private class we have to be able to read:
[NSUnarchiver decodeClassName:@"PSMatrix" asClassName:@"MyPSMatrix"];
}
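If you want to convince yourself that the rewiring works before pointing it at a real legacy file, a small self-contained experiment helps. The sketch below uses two made-up classes, OldShape and NewShape, that share one stream layout; neither comes from Create. It assumes ARC, and that NSArchiver and NSUnarchiver are still available (they are deprecated on current macOS).

```objc
#import <Foundation/Foundation.h>

// OldShape stands in for a legacy class whose name is recorded in the archive.
@interface OldShape : NSObject <NSCoding>
{
    int sides;
}
- (id)initWithSides:(int)count;
@end

@implementation OldShape
- (id)initWithSides:(int)count
{
    self = [super init];
    if (self) sides = count;
    return self;
}
- (void)encodeWithCoder:(NSCoder *)coder
{
    [coder encodeValueOfObjCType:@encode(int) at:&sides];
}
- (id)initWithCoder:(NSCoder *)coder
{
    self = [super init];
    if (self) [coder decodeValueOfObjCType:@encode(int) at:&sides];
    return self;
}
@end

// NewShape reads the same stream layout under a different class name.
@interface NewShape : NSObject <NSCoding>
{
    int sides;
}
- (int)sides;
@end

@implementation NewShape
- (int)sides { return sides; }
- (void)encodeWithCoder:(NSCoder *)coder
{
    [coder encodeValueOfObjCType:@encode(int) at:&sides];
}
- (id)initWithCoder:(NSCoder *)coder
{
    self = [super init];
    if (self) [coder decodeValueOfObjCType:@encode(int) at:&sides];
    return self;
}
@end

// Archive an OldShape, register the remapping, and unarchive the same bytes.
static id UnarchiveOldShapeAsNewShape(int sideCount)
{
    OldShape *old = [[OldShape alloc] initWithSides:sideCount];
    NSData *stream = [NSArchiver archivedDataWithRootObject:old];

    [NSUnarchiver decodeClassName:@"OldShape" asClassName:@"NewShape"];
    return [NSUnarchiver unarchiveObjectWithData:stream];
}
```

The object that comes back should be a NewShape carrying the side count that was archived, which is the behavior the PrintInfo and NXImage mappings rely on. The mapping is registered on the NSUnarchiver class itself, so it stays in force for every unarchiver created afterwards.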
Through trial and error handling, I figured out what View and Window were trying to read. When the Unarchiver reads through the stream, it checks to see what the next type archived is. If that is different from what you are trying to read, an exception is thrown with a message that explains what it expected versus what you attempted to read. I usually put a breakpoint in NSException's raise, so I can quickly backtrace to the line of code which caused the exception. To set this breakpoint in PBX's debugger, pause the execution of the program, and enter:
br -[NSException raise]
Because we don't really need an NSView or NSWindow, simply the list of graphics stored by the View subclass, we are only trying to read through the stream, keeping in synch with the objects therein, so we can get to our data model contained by the view. Here are minimal versions of the classes you will need to read old NeXTStep object streams:
@interface MyList : List
@end
@implementation MyList
- (id)initWithCoder:(NSCoder *)aDecoder
{
int version = [aDecoder versionForClassName:@"List"];
NSZone *zone = [self zone];
NS_DURING
if (version == 0) {
[aDecoder decodeValueOfObjCType:"i" at:&maxElements];
[aDecoder decodeValueOfObjCType:"i" at:&numElements];
dataPtr = (id *) NSZoneMalloc (zone, numElements*sizeof(id));
[aDecoder decodeArrayOfObjCType:"@" count:numElements at:dataPtr];
} else {
[aDecoder decodeValueOfObjCType:"i" at:&numElements];
maxElements = numElements;
if (numElements) {
dataPtr = (id *) NSZoneMalloc (zone, numElements*sizeof(id));
[aDecoder decodeArrayOfObjCType:"@" count:numElements at:dataPtr];
}
}
NS_HANDLER
NSLog(@"threw exception reading List: %@ : %@",[localException name], [localException reason]);
NS_ENDHANDLER
return self;
}
@end
@interface MyResponder : Object
{
id nextResponder;
id _reserved;
}
- (id)initWithCoder:(NSCoder *)coder;
@end
@implementation MyResponder
- (id)initWithCoder:(NSCoder *)aDecoder {
nextResponder = [aDecoder decodeObject];
return self;
}
@end
@implementation MyView
// we do use the frame and bounds:
- (NSRect)frame { return frame; }
- (NSRect)bounds { return bounds; }
- (id)initWithCoder:(NSCoder *)aDecoder
{
float f;
unsigned int version = [aDecoder versionForClassName:@"View"];
NS_DURING
[super initWithCoder:aDecoder];
[aDecoder decodeValueOfObjCType:"f" at:&f];
frame = [aDecoder decodeRect];
bounds = [aDecoder decodeRect];
superview = [aDecoder decodeObject];
window = [aDecoder decodeObject];
[aDecoder decodeValuesOfObjCTypes:"@ss@", &subviews, &vFlags, &_vFlags, &_drawMatrix];
NS_HANDLER
NSLog(@"threw exception reading View: %@ : %@",[localException name], [localException reason]);
NS_ENDHANDLER
return self;
}
@end
@interface MyWindow : MyResponder
{
NSRect frame;
id contentView;
id delegate;
id firstResponder;
id lastLeftHit;
id lastRightHit;
id counterpart;
id fieldEditor;
int winEventMask;
int windowNum;
float backgroundGray;
struct _wFlags {
#ifdef __BIG_ENDIAN__
unsigned int style:4;
unsigned int backing:2;
unsigned int buttonMask:3;
unsigned int visible:1;
unsigned int isMainWindow:1;
unsigned int isKeyWindow:1;
unsigned int isPanel:1;
unsigned int hideOnDeactivate:1;
unsigned int dontFreeWhenClosed:1;
unsigned int oneShot:1;
#else
unsigned int oneShot:1;
unsigned int dontFreeWhenClosed:1;
unsigned int hideOnDeactivate:1;
unsigned int isPanel:1;
unsigned int isKeyWindow:1;
unsigned int isMainWindow:1;
unsigned int visible:1;
unsigned int buttonMask:3;
unsigned int backing:2;
unsigned int style:4;
#endif
} wFlags;
struct _wFlags2 {
#ifdef __BIG_ENDIAN__
unsigned int deferred:1;
unsigned int _cursorRectsDisabled:1;
unsigned int _haveFreeCursorRects:1;
unsigned int _validCursorRects:1;
unsigned int docEdited:1;
unsigned int dynamicDepthLimit:1;
unsigned int _worksWhenModal:1;
unsigned int _limitedBecomeKey:1;
unsigned int _needsFlush:1;
unsigned int _newMiniIcon:1;
unsigned int _ignoredFirstMouse:1;
unsigned int _repostedFirstMouse:1;
unsigned int _windowDying:1;
unsigned int _tempHidden:1;
unsigned int _hiddenOnDeactivate:1;
unsigned int _floatingPanel:1;
#else
unsigned int _floatingPanel:1;
unsigned int _hiddenOnDeactivate:1;
unsigned int _tempHidden:1;
unsigned int _windowDying:1;
unsigned int _repostedFirstMouse:1;
unsigned int _ignoredFirstMouse:1;
unsigned int _RESERVED:1;
unsigned int _needsFlush:1;
unsigned int _limitedBecomeKey:1;
unsigned int _worksWhenModal:1;
unsigned int dynamicDepthLimit:1;
unsigned int docEdited:1;
unsigned int _validCursorRects:1;
unsigned int _haveFreeCursorRects:1;
unsigned int _cursorRectsDisabled:1;
unsigned int deferred:1;
#endif
} wFlags2;
id _borderView;
short _displayDisabled;
short _flushDisabled;
void *_cursorRects;
id _trectTable;
id _invalidCursorView;
id _miniIcon;
void *private;
}
@end
@implementation MyWindow
- (id)initWithCoder:(NSCoder *)aDecoder
{
unsigned int version = [aDecoder versionForClassName:@"Window"];
NSRect frameRect;
char *title;
NSZone *zone = [self zone];
short flags;
NS_DURING
[super initWithCoder:aDecoder];
frame = [aDecoder decodeRect];
if (version == 0) {
[aDecoder decodeValuesOfObjCTypes:"@@ifss*", &contentView, &counterpart, &winEventMask, &backgroundGray, &wFlags, &wFlags2, &title];
} else if (version == 1) {
[aDecoder decodeValuesOfObjCTypes:"@@ifss**", &contentView, &counterpart, &winEventMask, &backgroundGray, &wFlags, &wFlags2, &title,&_miniIcon];
} else if (version >= 2) {
NSColor *tmpColor;
[aDecoder decodeValuesOfObjCTypes:((version >= 4) ? "@@ifss*@s" : "@@ifss**s"), &contentView, &counterpart, &winEventMask, &backgroundGray, &wFlags, &wFlags2, &title, &_miniIcon, &flags];
tmpColor = [aDecoder decodeNXColor];
}
if (version >= 3) {
NSSize min, max;
char c;
[aDecoder decodeValueOfObjCType:"c" at:&c];
if (c & 1) min = [aDecoder decodeSize];
if (c & 2) max = [aDecoder decodeSize];
}
delegate = [aDecoder decodeObject];
firstResponder = [aDecoder decodeObject];
frameRect = frame;
frameRect.origin.x = frameRect.origin.y = 0.0;
NS_HANDLER
NSLog(@"threw exception reading Window: %@ : %@",[localException name], [localException reason]);
NS_ENDHANDLER
return self;
}
@end
// this is two matrices - probably for view rotation - but our view cannot be rotated, so we read and go on...
@interface MyPSMatrix: Object
{
float matrixElements[12];
unsigned short flags;
}
@end
@implementation MyPSMatrix
- (id)initWithCoder:(NSCoder *)aDecoder
{
float f;
unsigned int version = [aDecoder systemVersion];
NS_DURING
if (version < 901) {
int temp;
[aDecoder decodeArrayOfObjCType:"f" count:12 at:&matrixElements];
[aDecoder decodeValueOfObjCType:"i" at:&temp];
} else {
[aDecoder decodeArrayOfObjCType:"f" count:12 at:&matrixElements];
[aDecoder decodeValueOfObjCType:"s" at:&flags];
}
NS_HANDLER
NSLog(@"threw exception reading PSMatrix: %@ : %@",[localException name], [localException reason]);
NS_ENDHANDLER
return self;
}
@end
So, the preceding code lets you get at your own objects, but consider the case where you have legacy files from OpenStep as well as NeXTStep. In OpenStep, we use the NSView hierarchy, so if you have a custom class which must open both NeXTStep and OpenStep typed streams, then you'll need conditional code at the top of its -initWithCoder:
GraphicView:NSView
- (id)initWithCoder:(NSCoder *)aDecoder
{
int version = [aDecoder versionForClassName:@"GraphicView"];
if (version < 400) { // NeXTSTEP...
// we don't really use too much of what's in a view
// An NSView cannot open us up - therefore, we'll use our custom MyView to imitate:
id oldStyleView = [[MyView alloc]init];
NSLog(@"attempting to read NeXTSTEP object file");
[oldStyleView initWithCoder:aDecoder];
// we'll grab two ivars we care about:
_frame = [oldStyleView frame];
_bounds = [oldStyleView bounds];
// if you needed more info from the View, just implement more methods in MyView, and query here...
} else [super initWithCoder:aDecoder]; // An OpenStep object file
.....
// now, the normal unarchiving follows, which includes reading the list of graphics
....
Now, the GraphicView stored a list of graphics - so we want to be able to convert the old List object into a new NSMutableArray. That trick is achieved by the following line, which invokes the NSMutableArray category method initFromList: defined after it:
_graphics = [[NSMutableArray allocWithZone:[self zone]] initFromList:_graphics];
@interface NSMutableArray(Compatibility)
- (id)initFromList:(id)aList;
@end
@implementation NSMutableArray(Compatibility)
- (id)initFromList:(id)aList
{
int i, count;
if ([aList respondsToSelector:@selector(isKindOf:)] && [aList isKindOf:[List class]] ) {
count = [aList count];
self = [self initWithCapacity:count]; // a class cluster may hand back a different object
for (i = 0; i < count; i++) {
[self addObject:[aList objectAt:i]];
}
} else if ([aList isKindOfClass:[NSArray class]]) {
return [self initWithArray:aList];
} else {
/* should probably raise */
}
return self;
}
@end
Writing the Property List
So, with this code, you should be able to unarchive NeXTStep data models that included List, View, Responder, and Window classes. The one 'caveat emptor' is that NXDataLink and its family of classes do not really have a corresponding class in the AppKit, so if your application allowed automatically updating embedded links, you are going to have to do more reverse engineering! Meanwhile, this whole mess could have been avoided by storing your data as a dictionary, with keys and values for each instance variable. An excellent example of the use of property lists as an archive format is provided in the OS X Developer release in Sketch: /Developer/Examples/AppKit/Sketch.
Once you have the legacy objects read into memory, then you write them out in the property list format:
static NSString *SKTGraphicsListKey = @"GraphicsList";
static NSString *SKTDrawDocumentVersionKey = @"DrawDocumentVersion";
static int SKTCurrentDrawDocumentVersion = 1;
static NSString *SKTPrintInfoKey = @"PrintInfo";
- (NSDictionary *)drawDocumentDictionaryForGraphics:(NSArray *)graphics {
NSMutableDictionary *doc = [NSMutableDictionary dictionary];
unsigned i, c = [graphics count];
NSMutableArray *graphicDicts = [NSMutableArray arrayWithCapacity:c];
for (i=0; i<c; i++) {
[graphicDicts addObject:[[graphics objectAtIndex:i] propertyListRepresentation]];
}
[doc setObject:graphicDicts forKey:SKTGraphicsListKey];
[doc setObject:[NSString stringWithFormat:@"%d", SKTCurrentDrawDocumentVersion] forKey:SKTDrawDocumentVersionKey];
[doc setObject:[NSArchiver archivedDataWithRootObject:[self printInfo]] forKey:SKTPrintInfoKey];
return doc;
}
To write it out as ASCII, get the dictionary and return the NSData:
- (NSData *)drawDocumentDataForGraphics:(NSArray *)graphics {
NSDictionary *doc = [self drawDocumentDictionaryForGraphics:graphics];
NSString *string = [doc description];
return [string dataUsingEncoding:NSASCIIStringEncoding];
}
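Reading the file back is the mirror image. The sketch below shows one way to do it, not Sketch's actual code: it turns the ASCII bytes into a string and asks NSString for its -propertyList. The function name is made up, and the NSString allocation assumes ARC (under manual reference counting, autorelease it).

```objc
#import <Foundation/Foundation.h>

// Hypothetical counterpart to drawDocumentDataForGraphics: turn the ASCII
// bytes back into a dictionary, or return nil if they do not parse.
static NSDictionary *DrawDocumentDictionaryFromData(NSData *data)
{
    NSString *string = [[NSString alloc] initWithData:data encoding:NSASCIIStringEncoding];
    id plist = nil;

    NS_DURING
        // -propertyList raises if the text is not a valid property list
        plist = [string propertyList];
    NS_HANDLER
        NSLog(@"not a property list: %@ : %@", [localException name], [localException reason]);
    NS_ENDHANDLER

    return [plist isKindOfClass:[NSDictionary class]] ? plist : nil;
}
```

Everything in an old-style property list comes back as a string, data, array or dictionary, so the version written with stringWithFormat: has to be read with something like [[doc objectForKey:@"DrawDocumentVersion"] intValue].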
To save the actual file, simply override -dataRepresentationOfType:
- (NSData *)dataRepresentationOfType:(NSString *)type {
if ([type isEqualToString:SKTDrawDocumentType]) {
return [self drawDocumentDataForGraphics:[self graphics]];
} else....
}
Finally, each Graphic subclass needs to know how to respond to propertyListRepresentation - here's an example:
NSString *SKTImageContentsKey = @"Image";
NSString *SKTFlippedHorizontallyKey = @"FlippedHorizontally";
NSString *SKTFlippedVerticallyKey = @"FlippedVertically";
- (NSMutableDictionary *)propertyListRepresentation {
NSMutableDictionary *dict = [super propertyListRepresentation];
[dict setObject:[NSArchiver archivedDataWithRootObject:[self image]] forKey:SKTImageContentsKey];
[dict setObject:([self flippedHorizontally] ? @"YES" : @"NO") forKey:SKTFlippedHorizontallyKey];
[dict setObject:([self flippedVertically] ? @"YES" : @"NO") forKey:SKTFlippedVerticallyKey];
return dict;
}
Because each Graphic subclass encodes its name under the key SKTClassKey, Graphic uses code like this to instantiate the correct class:
+ (id)graphicWithPropertyListRepresentation:(NSDictionary *)dict {
Class theClass = NSClassFromString([dict objectForKey:SKTClassKey]);
id theGraphic = nil;
if (theClass) {
theGraphic = [[[theClass allocWithZone:NULL] init] autorelease];
if (theGraphic) {
// read all the values stored and reconstitute:
[theGraphic loadPropertyListRepresentation:dict];
}
}
return theGraphic;
}
Again, refer to the Sketch source code for a full implementation of reading and writing property lists.
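The compatibility claim rests on how loadPropertyListRepresentation: treats keys. The sketch below is a stripped-down illustration with a made-up DemoGraphic class, not Sketch's implementation; the key names are assumptions too. A key the object does not know about is never asked for, and a key that is missing leaves the default in place.

```objc
#import <Foundation/Foundation.h>

static NSString *DemoClassKey = @"Class";                          // assumed key name
static NSString *DemoFlippedHorizontallyKey = @"FlippedHorizontally";

// DemoGraphic is a hypothetical stand-in for a Graphic subclass.
@interface DemoGraphic : NSObject
{
    BOOL flippedHorizontally;
}
- (BOOL)flippedHorizontally;
- (void)setFlippedHorizontally:(BOOL)flag;
- (NSMutableDictionary *)propertyListRepresentation;
- (void)loadPropertyListRepresentation:(NSDictionary *)dict;
@end

@implementation DemoGraphic
- (BOOL)flippedHorizontally { return flippedHorizontally; }
- (void)setFlippedHorizontally:(BOOL)flag { flippedHorizontally = flag; }

- (NSMutableDictionary *)propertyListRepresentation
{
    NSMutableDictionary *dict = [NSMutableDictionary dictionary];
    [dict setObject:NSStringFromClass([self class]) forKey:DemoClassKey];
    [dict setObject:(flippedHorizontally ? @"YES" : @"NO") forKey:DemoFlippedHorizontallyKey];
    return dict;
}

- (void)loadPropertyListRepresentation:(NSDictionary *)dict
{
    // Only look up the keys this version understands; anything else in the
    // dictionary (written by a later version, say) is left alone.
    id value = [dict objectForKey:DemoFlippedHorizontallyKey];

    // A missing key (an older file) keeps the default of NO.
    if (value) flippedHorizontally = [value isEqual:@"YES"];
}
@end
```

Adding an extra key to the dictionary before loading it, or loading an empty dictionary, should both succeed quietly. A later version can therefore add a key without breaking an earlier reader.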
Conclusion
The end result of using property lists as a file format is that humans can read them, edit them at will, and open them with other applications, including earlier versions of the creating application. The advantages of property lists far outweigh the only disadvantage, which is file size: about twice that of a binary equivalent.
Andrew Stone, CEO of Stone Design Corp - www.stone.com, has twice the number of ZPG children and farms chiles and software from a tower along the Rio Grande in Albuquerque, New Mexico.