It is not clear to me what your goal is.
A compound document file is laid out as a file system with directories as
storage objects and files as streams.
The storage objects are not real directories; each one represents an
individual COM storage object.
The files are not real files; they are data represented as streams.
> //9) Save the file to disk.
> IPersistFile perFile =
> (IPersistFile)Marshal.GetObjectForIUnknown(ptrFile);
> r = perFile.Save(@"D:\Data\Result\Output.doc", true);
> //r = E_FAIL
Michael, my goal is to extract all documents embedded within any given
compound document. Here is the current revision of my code:
//6) Get a pointer to the IPersistStorage Interface.
//comObj = Word, Excel, etc.
Guid IID_IPersistStorage = typeof(IPersistStorage).GUID;
IntPtr pIUnk = Marshal.GetIUnknownForObject(comObj);
IntPtr ptrStorage;
Int32 r = Marshal.QueryInterface(pIUnk, ref IID_IPersistStorage, out
ptrStorage);
//7) Load the COM object from the storage.
IPersistStorage per =
(IPersistStorage)Marshal.GetObjectForIUnknown(ptrStorage);
r = per.Load(Marshal.GetIUnknownForObject(store));
//8) Create new storage and save to disk.
IStorage temp;
StgCreateDocfile(@"D:\Data\Result\Output" + count,
STGM.READWRITE|STGM.SHARE_EXCLUSIVE|STGM.CREATE, 0, out temp);
OleSave(per, temp, false);
I understand that this approach will not work for all documents. For
non-Office documents I have a different approach, where I write the
"CONTENTS" stream to the file system. This works for PDFs, for example.
The above approach works perfectly for all office document types except
Excel. Excel documents are created, however they seem to be lacking some
information and will not open natively. I can "fix" them by dragging them
into IE, where they can be viewed, and then saving them. This workaround is
not acceptable to me.
Excel storages are associated with the CLSID of Excel.Application. However,
I can't get an IPersistStorage interface for this object; instead I need to
use the CLSID of Excel.Sheet to get any joy here at all. I wonder whether my
problem is somehow related.
Regards,
Heath.
> It is not clear to me what your goal is.
>
[quoted text clipped - 29 lines]
> "Contents" stream and use the windows file system api
> to write the file.
Michael Phillips, Jr. - 17 Jul 2007 01:13 GMT
> The above approach works perfectly for all office document types except
> Excel. Excel documents are created, however they seem to be lacking some
> information and will not open natively. I can "fix" them by dragging them
> into IE, where they can be viewed, and then saving them. This workaround is
> not acceptable to me.
It works for Excel also. However, the Excel spreadsheet is marked as hidden.
There are two ways to solve the problem.
1) Use OLE Automation to create an "Excel.Application" OLE object, get
the global Windows collection, and mark the hidden window as visible.
This approach is not very fast, as the Excel application must be present
on the system and it takes some time to load. Using IDispatch is also
cumbersome.
2) Use the IStream to find the "WINDOW1" record in the "Workbook" stream.
The record identifier is 0x003D. Once you have found it, load the record.
The record has an options bit field.
Here is the structure:
typedef struct _WINDOW1_RECORD
{
//short window1Marker; // Should be 0x003D
//short recordSize; // size of record in bytes:
// biff2-biff4 (8 bytes), biff5-biff8 (18 bytes)
short hpos;
short vpos;
short width; // the BIFF5-BIFF8 layout stores width before height
short height;
short options; // bit 0 (mask 0x0001): 0 - visible, 1 - hidden
short idxActiveWkSheet;
short idxVisibleTab;
short selectedWkSheet;
short widthTabBar;
} WINDOW1_RECORD;
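The options field is a "short". If you clear the hidden bit (setting the
whole field to 0 also works, but it resets the other window flags too) and
then save the stream, the Excel spreadsheet will be visible.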
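
Option 2 can be sketched roughly as follows, assuming you have already read
the whole "Workbook" stream into a memory buffer. The function and macro
names here are made up for illustration. The sketch assumes the usual BIFF
record layout (a 2-byte identifier and a 2-byte size, both little-endian,
followed by the record data) and an unencrypted workbook. It clears only the
hidden bit rather than zeroing the whole options field.

```c
#include <stddef.h>
#include <stdint.h>

#define BIFF_EOF               0x000A /* end of the current substream */
#define BIFF_WINDOW1           0x003D /* WINDOW1 record identifier */
#define WINDOW1_OPTIONS_OFFSET 8      /* options follows four 2-byte fields */
#define WINDOW1_HIDDEN         0x0001 /* bit 0 of options: window is hidden */

/* Read a 16-bit little-endian value from p. */
static uint16_t read_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper: walk the BIFF records in buf and clear the hidden
 * flag of the first WINDOW1 record in the first substream.
 * Returns 1 if a WINDOW1 record was found, 0 if none was found or the
 * buffer is truncated.
 */
int biff_unhide_window1(uint8_t *buf, size_t len)
{
    size_t pos = 0;

    while (len - pos >= 4) {
        uint16_t id   = read_u16le(buf + pos);
        uint16_t size = read_u16le(buf + pos + 2);
        uint8_t *data = buf + pos + 4;

        if (size > len - pos - 4)
            return 0; /* record claims more data than the buffer holds */

        if (id == BIFF_WINDOW1 && size >= WINDOW1_OPTIONS_OFFSET + 2) {
            uint16_t options = read_u16le(data + WINDOW1_OPTIONS_OFFSET);

            options &= (uint16_t)~WINDOW1_HIDDEN; /* keep the other flags */
            data[WINDOW1_OPTIONS_OFFSET]     = (uint8_t)(options & 0xFF);
            data[WINDOW1_OPTIONS_OFFSET + 1] = (uint8_t)(options >> 8);
            return 1;
        }

        if (id == BIFF_EOF)
            return 0; /* end of the first substream, no WINDOW1 seen */

        pos += 4 + (size_t)size;
    }
    return 0;
}
```

After patching the buffer you would seek back to the start of the "Workbook"
stream, write the buffer out again through the IStream, and commit the
storage. That part depends on your interop declarations, so it is left out
here.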