DevGlobalCache - A way to Cache and Share data between processes
By Natty Gur | Published: 19 June 2005
Reader Level: Advanced
Introduction
In the halcyon days of COM architecture, the most common way to share data
between processes was the use of a COM EXE class. A COM object, which runs in
its own memory space, enabled retention of the data with the code of the
process. This data could be shared between all COM objects created from the COM
EXE classes.
The above procedure was adopted by many of my clients when they needed to cache
unchanged data (for example, a list of cities) on the server. When .NET was
released, my clients and I looked for a way to cache such data as we had done
before. Our first instinct was to look for a .NET equivalent of the COM EXE
class, but there is none. What we did find was that we can create an EXE and
connect to it through remoting. However, when we tested this approach, we ran
into a performance setback. Deeper investigation revealed that opening a TCP
port to and from the client side was extremely slow. In addition, remoting
needs a listener that runs constantly (listening on the server port) to receive
client requests, which made the task even more complicated.
Bearing this in mind, some infrastructure work is clearly needed. We need a
mechanism that lets us share data between processes with little overhead.
There are some options for building this mechanism. We will discuss two of them:
-
We can serialize an object into a file on disk and then deserialize it from
any other process. However, even under ideal conditions, writing to and reading
from a file on disk will not be faster than remoting.
-
Another solution is based on Win32 memory-mapped files (MMF). An MMF lets us
read and write a file, but the actual I/O takes place in RAM through standard
memory addressing, which gives a performance advantage. Another benefit is that
an MMF can be shared across processes: different applications open it by the
same global name (the name assigned when the MMF was created).
Of the two, MMF is clearly the better option.
Using MMF, we will build a mechanism that lets us share data between processes
with a negligible performance cost. The challenge is to work with MMF even
though, as of this writing, the managed .NET framework has no classes that
handle MMF. To do this, we will go back to unmanaged code (the Win32 API) via
P/Invoke (interop). This article demonstrates how to use MMF from .NET to
create a global cache for a .NET application. We achieve this by creating a set
of classes that encapsulate both the P/Invoke code needed to use MMF and the
logic behind using it. While writing to and reading from the MMF, we will
marshal our data between managed and unmanaged memory.
Defining the problem
Sharing data between processes is one of the common problems that we face when
we write applications. There are two main ways that a programmer can use memory
sharing between processes:
-
Caching data: Suppose several web applications and objects on the server need
the same unchanged set of data from the database. We can improve server
performance by not letting every application or object go to the database. By
caching the data on the first access, we serve later requests from the cache
and avoid the database round trip. If all the applications are web
applications, we can cache the data in the aspnet_wp.exe process by using the
Application object. However, if we need to share data between web applications
and objects that live in other processes, we need to find a way to share the
cached data between all those processes. Cache data of this kind is entered
once and seldom changed. If the cache is changed, it is usually changed from
the same process that created it.
-
Maintaining state: While developing a stateless application, such as a web
application, the developer needs a way to save the state of the application
between calls. The state data is the set of data that we need to keep from one
call of the application to the next in order to maintain the application flow.
It usually consists of a small chunk of data (such as the user id and password,
or the last action the user performed in the application). If the application
being developed is based on web pages, the Session object can supply this
functionality. There are two scenarios in which the Session object will fail:
-
The first occurs when the object being developed is running as a server
application in COM+. In such a scenario the data that exists in the Session
object cannot be obtained from the object being developed. In .NET the web
context cannot be reached from the COM+ context.
-
The second scenario occurs when the application is hosted on a web farm. In
this situation, one call from the client can reach one server and the next
client call can reach another server. You have probably guessed that if you set
your state data on the first server, it will not be available from the other
server. The main difference between the state data and the cache data is that,
state data is usually a small chunk of data that changes rapidly.
Sharing data between processes in a system that does not allow one process to
access the memory of another is a well-known problem, for which Win32 has
answers (pipes, mailslots, memory-mapped files, etc.). COM presents a simpler
solution. If we create a COM EXE server, all the instances that other processes
create from it live in the memory area that was allocated for the COM EXE
process. If we keep data in that process, the data can be shared among all the
instances. This is easy to achieve, as the following code, which caches
Recordsets, demonstrates. All that is needed is to declare the candidate cache
data as members of the EXE module class, and then to access the data from the
instance code through the extern _Module declaration:
typedef map<_bstr_t,_RecordsetPtr> DATAMAP;
typedef DATAMAP::value_type vtData;
typedef vector<DATA, allocator<Data> > DATAVECTOR;
class CExeModule : public CComModule
{
public:
LONG Unlock();
DWORD dwThreadID;
HANDLE hEventShutdown;
void MonitorShutdown();
bool StartMonitor();
bool bActivity;
DATAMAP DataMap;
DATAMAP::iterator MapIterator;
DATAVECTOR CachDataVector;
DATAVECTOR::iterator CachDataIterator;
HANDLE hMutex;
};
extern CExeModule _Module;
As you have probably guessed, if we leave out the code that guards the shared
data against concurrent access from multiple threads, we will run into
problems.
While programming with COM, I used the COM EXE approach to cache rarely changed
recordsets from databases, as well as to maintain state data between
application calls in stateless applications on the client or on the server
side. When I started to write the new infrastructure for .NET, I looked for a
solution to the same problems. At first glance, writing an EXE class with the
same machinery seemed acceptable. However, I soon discovered a performance
problem when I put the machine under stress: the remoting connection between
aspnet_wp.exe and the EXE that held the cached data was not fast enough. What
was needed was a way to share data between the processes with minimal impact on
system performance.
Defining the solution
The solution to the problem can be found in the problem description. The data
that I put in the code of the COM EXE is shared among all the instances of the
COM EXE classes, because Win32 actually holds one copy of the COM EXE code in
physical memory. Every process created from the EXE is mapped into this
physical memory space. By using this characteristic from managed code, we can
develop a satisfactory solution.
Shared memory is one of the options that Win32 supplies for inter-process
communication (IPC), but there are others. We should therefore first examine
the IPC types to see whether any other option can be used to obtain a solution.
To craft working assemblies that supply the requested functionality, we first
need to divide the work into three main blocks:
-
Build a class that holds all the interoperability declarations needed to
communicate with the Win32 memory-mapped file functionality. Microsoft built a
large number of classes into the CLR, but MMF functionality is not among them.
-
Build a class that encapsulates the MMF. Instead of working directly with the
API, we will create a class that governs all work with MMFs. This class will
hold the internal data and functionality that the solution requires.
-
Maintain the logic. The solution is more than writing to and reading from MMFs.
It also holds the logic for sharing data between processes through the MMF. The
solution allows the programmer to insert objects into the MMF, and to retrieve
and delete objects from it. To provide this functionality, some issues need to
be addressed. For example, we need to keep track of the name and the physical
location of every object that the programmer adds. This information is
necessary for retrieving and deleting those objects.
We will place these three code blocks into a DLL. In this way, every process
that loads the DLL into its memory area will be able to use its functionality
to store data in and retrieve data from memory-mapped files.
At first I intended to show in this article how to share the cached data across
all the servers of a web farm. This functionality is necessary for maintaining
state across web farm servers. As this aspect requires a very lengthy
discussion, it could be the subject of a future article.
Because the solution is fairly complex, we will first go over all the classes
to be created and the main task of each class in the solution.
-
Win32MapApis holds all the P/Invoke declarations that enable calls to unmanaged
code.
-
MemoryMappedFile encapsulates the MMF functionality. This class can be used
independently to access MMF.
-
MapViewStream encapsulates a stream that can read from and write to the memory
address of the MMF. This class is the core of the solution. It enables us to
read and write managed objects from and to unmanaged memory.
-
DevCache holds the logic that we need in order to give the programmer a simple
way to activate the machinery.
-
FileMapIOException is a custom exception class that surfaces the Win32 errors
raised while working with MMF.
-
OFSTRUCT shows how to pass a structure to a Win32 API that fills it with data.
Component design and coding
What are MMFs and how do they help us?
In the COM solution we used one of the inter-process communication (IPC)
mechanisms available in Win32: the memory-mapped file. As indicated above, MMF
will also be used in our solution. We need a way to share data between
processes, and MMF appears to be suitable. However, we will first examine all
the IPC types to determine whether this is the right decision and whether any
other IPC type could be used in this solution.
IPC types
The IPC types can be split into two main categories: local and network. Local
IPC is designed for communication within one machine. Network IPC is used
across machines. The local IPC consists of:
-
Atoms are strings or integers. They are identified by a handle and can be
reached from every process. They were originally created for DDE. Win32 limits
them to 37 strings and integers only.
-
Shared memory (memory-mapped files, MMF) is a Win32 mechanism that enables
several processes to share the same physical memory area. Win32 is built in
such a way that a process cannot access another process's memory area. This is
achieved by giving every process 2 GB of virtual memory that is private to the
process. Win32 shares memory between processes through that virtual memory: the
shared data is placed in one physical memory area, and every process has a
range in its own virtual address space that maps to this physical memory. The
mechanism can control whether a process may read or write the physical memory
area. Win32 also lets us map physical files, or portions of files, into virtual
memory. This is why the mechanism is also called memory-mapped files. This
feature enables us, and the operating system, to access files in the same way
that we access memory.
-
A mutex is an object that can be owned by only one thread at a given time.
Other threads that try to take ownership of the mutex are blocked in a queue
until the current owner releases it. A mutex is a good way to control the
access of threads to a resource, because only one thread at a time can access
the resource. We will use a mutex to control access to the shared memory area
from different processes.
-
A semaphore is like a mutex, but the number of threads that can access the
resource at the same time can be set. All other threads are blocked and queued.
-
Critical sections behave like a mutex. Their limitation is that they cannot be
shared between threads of different processes. Their advantage is that they are
lighter and faster than a mutex.
-
Events are objects that can be used to signal threads when they change state.
An event can signal that we have finished with a resource, that initialization
is complete, or that a thread is ready to receive data.
-
Other types of local IPC that can be used include the window messages, DDE,
clipboard and as mentioned above, COM.
Network IPCs can be:
-
Network protocols (NetBEUI, TCP/IP) are the most flexible option for
transferring data between processes on several machines. The problem is that
they require a great deal of programming (listening, handling multiple
requests, etc.). Their advantage is that a solution built this way depends on a
single network protocol only. The best protocol is TCP/IP, which is the most
flexible and the most widespread across operating systems.
-
Mailslots are like Win32 messages, but they can be sent to other machines
across the network. Like Win32 messages, mailslots are limited in the amount of
data that can be sent. They can also be broadcast to all the machines in the
domain as datagrams (UDP), which means there is no guarantee that the receiver
will actually receive the data.
-
Pipes. If mailslots are like UDP, pipes are like TCP. A pipe connects two end
points, so that once it is set up, data can be sent through it between the two
end points.
-
Another option is RPC (Remote Procedure Call), which enables calling a
procedure on a remote machine, and DCOM (which is the option used to activate
remote COM objects).
Having examined all the IPC options, it becomes clear that memory-mapped files
are the best choice for storing data that several processes on the machine
share. This is the only option that allows us to share large amounts of data.
In addition to MMF, the solution uses a mutex to synchronize the write access
of processes to the memory. Our next step is to understand how MMF works and
how it is used from unmanaged code. This knowledge will be used to build a
class that holds all the P/Invoke declarations needed to call the unmanaged
code from managed code.
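The role of the mutex can be illustrated with a minimal sketch that uses the
managed System.Threading.Mutex class with a system-wide name. The class name
SharedWriteLock and the mutex name are hypothetical, chosen only for this
illustration, and the sketch is not necessarily how the downloadable code takes
its lock. Every process that passes the same name gets the same mutex object,
so only one of them at a time runs the guarded write.

```csharp
using System;
using System.Threading;

public static class SharedWriteLock
{
    // Hypothetical name: every cooperating process must use the same string.
    public const string MutexName = "DevGlobalCacheDemoMutex";

    // Runs 'write' while holding the named mutex and always releases it.
    public static void Run(Action write)
    {
        using (Mutex mutex = new Mutex(false, MutexName))
        {
            bool owned = false;
            try
            {
                try
                {
                    owned = mutex.WaitOne();
                }
                catch (AbandonedMutexException)
                {
                    // A previous owner exited without releasing the mutex.
                    // Ownership has still been granted to this thread.
                    owned = true;
                }
                write();
            }
            finally
            {
                if (owned)
                    mutex.ReleaseMutex();
            }
        }
    }
}
```

A caller would wrap each write to the shared memory area, for example
SharedWriteLock.Run(delegate { /* write to the MMF here */ });. Readers can
take the same lock if they must never observe a half-written object.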
Converting Win32 API to .NET
In this article we will use MMF only to map files. However, the Win32 API also
lets us map the system page file; when we reach the declaration of
CreateFileMapping, we will see how. The main reason for mapping files is that
Windows writes the changes we make through the MMF back to the underlying file.
The operating system does this for us, so the update is transparent to our
code. At the end of the process, we are left with a file that holds our data.
We can use this file to maintain state even if the computer shuts down.
The first step is to create a file that will eventually be mapped into memory,
so that we can read from and write to it. To create this file, we are going to
use the Win32 API CreateFile. Although System.IO.FileStream could also be used,
this article deals with P/Invoke, and this function is a good way to
demonstrate it. The API declaration of this function is:
HANDLE CreateFile(
LPCTSTR lpFileName, // file name
DWORD dwDesiredAccess, // access mode
DWORD dwShareMode, // share mode
LPSECURITY_ATTRIBUTES lpSecurityAttributes,// SD
DWORD dwCreationDisposition, // how to create
DWORD dwFlagsAndAttributes, // file attributes
HANDLE hTemplateFile // handle to template file
);
To use this function in .NET, we need to convert all the C types into .NET CLS
types. To do this, we will create a new class, Win32MapApis. This class will
hold all the code needed for interop with the Win32 API. To deal with unmanaged
functions that are imported from DLLs, we use some of the classes in the
System.Runtime.InteropServices namespace. We can import the namespace with a
using directive. Importing lets us refer to a class by its short name;
otherwise the fully qualified class name would be needed. We will import the
namespace to simplify our code. (A using directive is purely a compile-time
convenience: it does not enlarge the DLL and has no effect on run-time
performance.)
While writing P/Invoke code, we use attributes a great deal. Attributes are
classes that we can apply to a target element (Assembly, Class, Constructor,
Delegate, Enum, Event, Field, Interface, Method, Module, Parameter, Property,
ReturnValue, or Struct). By applying an attribute to an element, we declare our
intentions. Declarative programming has excited me for a long time. From the
days of DOS, we were able to use DumpBin to see the intentions of the
programmer regarding his code. The trend continued with COM, which lets us
record more data about our intentions in the type library metadata. COM+ took
this one step further by using attributes, but implementing them was a tedious
task. .NET makes the implementation easier. When we declare an intention, we
are actually attaching signs (attributes) that have a published, predefined
meaning. The code that handles the element (the CLR or the programmer's own
code) can then take the predefined meaning of the sign into account. An
attribute can, for example, instruct the handling code to start a transaction
or to serialize the element:
[ DllImport("kernel32", SetLastError=true, CharSet=CharSet.Auto ) ]
The DllImport attribute tells the runtime that the function it is applied to is
imported from an external DLL that contains unmanaged code. The properties of
the attribute help the runtime handle the function. With the attribute we can
specify:
-
The name of the DLL that holds the function.
-
The calling convention of the function
-
The way the string will be marshalled (CharSet) across managed and unmanaged
code
-
The entry point that holds the actual name of the function.
-
An indication of whether the called API function calls the Win32 SetLastError
function before returning (SetLastError), among other options:
public static extern IntPtr CreateFile (
    String lpFileName,
    int dwDesiredAccess,
    int dwShareMode,
    IntPtr lpSecurityAttributes,
    int dwCreationDisposition,
    int dwFlagsAndAttributes,
    IntPtr hTemplateFile );
After completing the DllImport attribute with the properties that we need, we
can start to declare the function. The extern keyword tells the compiler that
the function is implemented externally, here in an unmanaged DLL. As this is
just a declaration without any implementation, the function has no body and the
declaration ends with a semicolon. The static extern combination lets the
runtime perform the equivalent of LoadLibrary and GetProcAddress for us.
The IntPtr structure represents a pointer or a handle in a platform-independent
way. We use this structure in P/Invoke every time we need to represent a
pointer or a handle in our C# code. You are probably aware that the Win32 API
contains many pointers (to strings, to structures, etc.) and handles (to files,
windows, etc.). For this reason, the structure is widely used in P/Invoke. If
we need to pass null to a parameter that was declared as IntPtr, the correct
way to do this is to pass IntPtr.Zero instead of null. In this declaration we
use IntPtr to represent the file handle that we receive from the operating
system and the pointer to the security attributes structure. As we progress
with the article, I will demonstrate how to represent a structure. That task
requires code and time. Therefore, if you know that you are going to pass null
for the pointer, you can save both by declaring the pointer to the struct as
IntPtr.
The declaration of the CreateFile function is shown below. As will be seen, the
details discussed above appear in the declaration. Data types that have the
same representation on both sides (for example, a 32-bit DWORD and a C# int)
do not require any extra work in order to marshal them:
[ DllImport("kernel32", SetLastError=true, CharSet=CharSet.Auto ) ]
public static extern IntPtr CreateFile (
    String lpFileName,
    int dwDesiredAccess,
    int dwShareMode,
    IntPtr lpSecurityAttributes,
    int dwCreationDisposition,
    int dwFlagsAndAttributes,
    IntPtr hTemplateFile );
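To make the attribute properties listed earlier concrete, here is a hedged
sketch that sets each of them explicitly and then calls the function. The class
name CreateFileDemo and the method names OpenNative and CanOpenForRead are
hypothetical and are not part of the article's source. The constant values are
the documented Win32 values for GENERIC_READ, FILE_SHARE_READ and
OPEN_EXISTING. The sketch only works on Windows, because it calls kernel32.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public static class CreateFileDemo
{
    const int GENERIC_READ = unchecked((int)0x80000000);
    const int FILE_SHARE_READ = 0x00000001;
    const int OPEN_EXISTING = 3;
    static readonly IntPtr InvalidHandleValue = new IntPtr(-1);

    // DLL name, entry point, calling convention, string marshalling and
    // SetLastError are all spelled out instead of left at their defaults.
    [DllImport("kernel32.dll", EntryPoint = "CreateFileW", ExactSpelling = true,
        CallingConvention = CallingConvention.Winapi,
        CharSet = CharSet.Unicode, SetLastError = true)]
    static extern IntPtr OpenNative(
        string lpFileName, int dwDesiredAccess, int dwShareMode,
        IntPtr lpSecurityAttributes, int dwCreationDisposition,
        int dwFlagsAndAttributes, IntPtr hTemplateFile);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool CloseHandle(IntPtr handle);

    // Returns true if the file exists and can be opened for reading.
    public static bool CanOpenForRead(string path)
    {
        IntPtr handle = OpenNative(path, GENERIC_READ, FILE_SHARE_READ,
            IntPtr.Zero, OPEN_EXISTING, 0, IntPtr.Zero);

        // CreateFile signals failure with INVALID_HANDLE_VALUE (-1), not null.
        if (handle == InvalidHandleValue)
        {
            // Available because SetLastError = true was requested above.
            int error = Marshal.GetLastWin32Error();
            Console.Error.WriteLine(new Win32Exception(error).Message);
            return false;
        }

        CloseHandle(handle);
        return true;
    }
}
```

Because EntryPoint names the exported function, the managed method can carry
any name we like; the article's own declaration relies on the default, where
the method name and CharSet.Auto select the export.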
The next step after creating the file is to create an MMF object that actually
maps the file into memory. CreateFileMapping receives the file handle together
with a protection attribute and creates the MMF object:
[ DllImport("kernel32", SetLastError=true, CharSet=CharSet.Auto) ]
public static extern IntPtr CreateFileMapping (
    IntPtr hFile,
    IntPtr lpAttributes,
    int flProtect,
    int dwMaximumSizeHigh,
    int dwMaximumSizeLow,
    String lpName );
There is nothing new in this declaration as far as P/Invoke is concerned, but
we need to understand the function parameters. The first is the handle to the
file that we received from the previous function. If we pass 0xFFFFFFFF
(INVALID_HANDLE_VALUE) for this parameter, the mapping is backed by the system
page file instead. The second is the security attributes. We will pass null and
accept the default security settings. The third parameter sets the protection
level of the file view when it is mapped. To make the code more readable, we
create an enumeration that holds all the protection options. The Flags
attribute indicates that the enumeration can be treated as a bit field, so its
members can be combined with a bitwise OR.
[Flags]
public enum MapProtection
{
PageNone = 0x00000000,
// protection
PageReadOnly = 0x00000002,
PageReadWrite = 0x00000004,
PageWriteCopy = 0x00000008,
// attributes
SecImage = 0x01000000,
SecReserve = 0x04000000,
SecCommit = 0x08000000,
SecNoCache = 0x10000000,
}
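As a small illustration of what the Flags attribute buys us, the sketch below
repeats the enumeration so that it stands alone and combines one protection
value with one section attribute. The helper class ProtectionDemo and its
methods are hypothetical names introduced only for this example.

```csharp
using System;

[Flags]
public enum MapProtection
{
    PageNone      = 0x00000000,
    // protection
    PageReadOnly  = 0x00000002,
    PageReadWrite = 0x00000004,
    PageWriteCopy = 0x00000008,
    // attributes
    SecImage      = 0x01000000,
    SecReserve    = 0x04000000,
    SecCommit     = 0x08000000,
    SecNoCache    = 0x10000000,
}

public static class ProtectionDemo
{
    // The value handed to the flProtect parameter is the plain integer.
    public static int ToFlProtect(MapProtection protection)
    {
        return (int)protection;
    }

    // Tests a single bit with a mask, so the check still works when
    // section attributes have been OR-ed into the value.
    public static bool AllowsWrite(MapProtection protection)
    {
        return (protection & MapProtection.PageReadWrite) != 0;
    }
}
```

For example, MapProtection.PageReadWrite | MapProtection.SecCommit yields
0x08000004. Note that a test written with == against PageReadWrite would no
longer match such a combined value, which is why the sketch masks the bit
instead.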
The fourth and fifth parameters set the maximum size of the file mapping. The
API uses two 32-bit parameters to get around the limit of a single 32-bit
value: together they form a 64-bit size, with the fourth parameter holding the
high-order 32 bits and the fifth the low-order 32 bits. Naturally, for mappings
smaller than 4 GB we do not need the fourth parameter, so we set it to 0. If we
set both parameters to 0, the length of the mapping will be the current length
of the file. If we set the mapping length explicitly, we must pay attention
that it is not smaller than the file size. The last parameter is an important
one. It sets the name of the file-mapping object, which identifies the mapping
uniquely across the entire operating system. If more than one process uses this
name to open a file mapping, they are all mapped to the same object.
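The split of a 64-bit size into the two 32-bit arguments can be sketched as
follows. The class and method names (MappingSize, High, Low, Combine) are
hypothetical helpers for illustration only.

```csharp
using System;

public static class MappingSize
{
    // High-order 32 bits: the fourth argument of CreateFileMapping.
    public static int High(long size)
    {
        return unchecked((int)(size >> 32));
    }

    // Low-order 32 bits: the fifth argument of CreateFileMapping.
    public static int Low(long size)
    {
        return unchecked((int)(size & 0xFFFFFFFF));
    }

    // Rebuilds the 64-bit size from the two halves.
    public static long Combine(int high, int low)
    {
        return ((long)high << 32) | (long)unchecked((uint)low);
    }
}
```

For a 5 GB mapping (5368709120 bytes, or 0x140000000), High returns 1 and Low
returns 0x40000000. For any size below 4 GB, High returns 0, which matches the
remark above that the fourth parameter can simply be set to 0.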
When mapping the system page file we need to work around a design issue. The
first parameter is declared as IntPtr, a structure that holds a 32-bit value on
Win32 and a 64-bit value on Win64. We are working on a Win32 operating system,
so when we try to pass 0xFFFFFFFF for this parameter, the value does not fit
into the signed 32-bit range of IntPtr and is rejected because of overflow. To
get around this, we overload CreateFileMapping with a version that takes a uint
as its first parameter:
[ DllImport("kernel32", SetLastError=true, CharSet=CharSet.Auto) ]
public static extern IntPtr CreateFileMapping (
    uint hFile,
    IntPtr lpAttributes,
    int flProtect,
    int dwMaximumSizeHigh,
    int dwMaximumSizeLow,
    String lpName );
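An alternative to the uint overload, offered here only as a sketch and not as
what the article's code does, is to build the special handle value from -1.
INVALID_HANDLE_VALUE is defined in the Win32 headers as -1 cast to a handle, so
new IntPtr(-1) has all bits set on both 32-bit and 64-bit platforms. The class
name Handles and the helper IsUsable are hypothetical.

```csharp
using System;

public static class Handles
{
    // All bits set: 0xFFFFFFFF on Win32, 0xFFFFFFFFFFFFFFFF on Win64.
    public static readonly IntPtr InvalidHandleValue = new IntPtr(-1);

    // True only for a handle that is neither null nor INVALID_HANDLE_VALUE.
    public static bool IsUsable(IntPtr handle)
    {
        return handle != IntPtr.Zero && handle != InvalidHandleValue;
    }
}
```

With this value, the IntPtr-based declaration of CreateFileMapping can be
called with Handles.InvalidHandleValue as its first argument to request a
mapping backed by the page file, and the same code keeps working if it is later
run as a 64-bit process.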
Only one process actually creates the file mapping with a given unique name.
Different processes could each create their own file mappings and obtain their
own MMF handles, but that is not what we want here. To share data, the
processes need to share the same MMF. Every process other than the one that
created the MMF therefore opens the existing file mapping by its unique name.
The OpenFileMapping function lets any process open an existing MMF. Its
declaration is simple:
[ DllImport("kernel32", SetLastError=true, CharSet=CharSet.Auto) ]
public static extern IntPtr OpenFileMapping (
int dwDesiredAccess,
bool bInheritHandle,
String lpName );
The first parameter sets the access rights to the file mapping. As with the
protection levels, we create an enumeration that holds all the options. The
second parameter indicates whether a child process will inherit this
file-mapping handle. The last parameter is the unique name that we used in the
CreateFileMapping function.
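The access enumeration is referred to as MapAccess later in the code. A
possible shape for it is sketched below. The numeric values are the documented
Win32 FILE_MAP_* constants, but the member names are assumptions made for this
illustration and may differ from those in the downloadable source.

```csharp
using System;

[Flags]
public enum MapAccess
{
    FileMapCopy      = 0x0001,     // FILE_MAP_COPY
    FileMapWrite     = 0x0002,     // FILE_MAP_WRITE
    FileMapRead      = 0x0004,     // FILE_MAP_READ
    FileMapAllAccess = 0x000F001F, // FILE_MAP_ALL_ACCESS
}

public static class MapAccessDemo
{
    // The value handed to dwDesiredAccess is the plain integer.
    public static int ToDesiredAccess(MapAccess access)
    {
        return (int)access;
    }
}
```

A process that only reads the cache could open the mapping with
MapAccess.FileMapRead, while the process that fills it would ask for
MapAccess.FileMapWrite or MapAccess.FileMapAllAccess.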
The next step after obtaining the handle to the memory-mapped file is to map a
view of all, or a part, of the file into the process. This operation reserves a
range in the address space of the process that points to the file-mapping
object. To do this, we need to declare the MapViewOfFile function:
[ DllImport("kernel32", SetLastError=true) ]
public static extern IntPtr MapViewOfFile (
IntPtr hFileMappingObject,
int dwDesiredAccess,
int dwFileOffsetHigh,
int dwFileOffsetLow,
int dwNumBytesToMap );
The first parameter is the handle of the MMF that we obtained through
CreateFileMapping or OpenFileMapping. The second parameter is the access mode,
expressed with the access enumeration mentioned above. The third and fourth
parameters are used together as a 64-bit offset that sets where the view begins
in the mapping. The last parameter sets the number of bytes to map. It can be
0, in which case the view extends from the given offset to the end of the file
mapping. The function returns a pointer to the address in the process memory
where the file is mapped. The MapViewOfFileEx function additionally lets us
suggest a specific base address for the view.
Changes made through the view are written by the operating system to physical
memory, and to the file when the view is unmapped or when the file-mapping
object is closed. If the update must happen immediately, call the
FlushViewOfFile function:
[ DllImport("kernel32", SetLastError=true) ]
public static extern bool FlushViewOfFile (
IntPtr lpBaseAddress,
int dwNumBytesToFlush );
This function gets the base address of the view in the memory as the first
parameter and the length of the view map file that needs to be written as the
second parameter. To un-map the view you need to call:
[ DllImport("kernel32", SetLastError=true) ]
public static extern bool UnmapViewOfFile ( IntPtr lpBaseAddress );
The only parameter here is the base address of the view of the file. Finally to
release the memory map file object, we will use the CloseHandle function:
[ DllImport("kernel32", SetLastError=true) ]
public static extern bool CloseHandle ( IntPtr handle );
As stated at the beginning of the section, we have created a wrapper class that
encapsulates all the API functions that we need in order to use the MMF
functionality.
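Before wrapping these functions in classes, it may help to see them work
together. The following is a minimal sketch, not the article's implementation:
it creates a small mapping backed by the page file, maps a view, writes one
integer through the view and reads it back. The class name SharedIntDemo, the
method RoundTrip and the 4096-byte size are assumptions made for illustration.
The declarations are repeated so that the sketch stands alone, and
dwNumBytesToMap is declared as IntPtr here because the native parameter is a
pointer-sized SIZE_T. The code runs on Windows only.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public static class SharedIntDemo
{
    const int PAGE_READWRITE = 0x04;
    const int FILE_MAP_ALL_ACCESS = 0x000F001F;
    static readonly IntPtr InvalidHandleValue = new IntPtr(-1);

    [DllImport("kernel32", SetLastError = true, CharSet = CharSet.Auto)]
    static extern IntPtr CreateFileMapping(IntPtr hFile, IntPtr lpAttributes,
        int flProtect, int dwMaximumSizeHigh, int dwMaximumSizeLow, string lpName);

    [DllImport("kernel32", SetLastError = true)]
    static extern IntPtr MapViewOfFile(IntPtr hFileMappingObject,
        int dwDesiredAccess, int dwFileOffsetHigh, int dwFileOffsetLow,
        IntPtr dwNumBytesToMap);

    [DllImport("kernel32", SetLastError = true)]
    static extern bool UnmapViewOfFile(IntPtr lpBaseAddress);

    [DllImport("kernel32", SetLastError = true)]
    static extern bool CloseHandle(IntPtr handle);

    // Writes 'value' into a named mapping and reads it back through the view.
    public static int RoundTrip(string mappingName, int value)
    {
        // INVALID_HANDLE_VALUE as hFile asks for a page-file-backed mapping.
        IntPtr hMap = CreateFileMapping(InvalidHandleValue, IntPtr.Zero,
            PAGE_READWRITE, 0, 4096, mappingName);
        if (hMap == IntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());
        try
        {
            // Zero bytes to map means: from offset 0 to the end of the mapping.
            IntPtr view = MapViewOfFile(hMap, FILE_MAP_ALL_ACCESS, 0, 0, IntPtr.Zero);
            if (view == IntPtr.Zero)
                throw new Win32Exception(Marshal.GetLastWin32Error());
            try
            {
                Marshal.WriteInt32(view, value);   // managed -> unmanaged memory
                return Marshal.ReadInt32(view);    // unmanaged -> managed memory
            }
            finally
            {
                UnmapViewOfFile(view);
            }
        }
        finally
        {
            CloseHandle(hMap);
        }
    }
}
```

A second process that calls OpenFileMapping with the same name while the first
still holds its handle would see the same integer through its own view. The
classes built in the following sections hide exactly this sequence of calls.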
Wrapping the MMF in classes
So far we have created a static wrapper class that encapsulates all the
P/Invoke work needed to call the Win32 functions. The solution uses MMF as the
machinery for storing data so that it can be shared between processes, and it
is based on mapping files (although using the system page file is also
possible). In this section we will build the classes that hold the
functionality and data we need in order to use memory-mapped files in our
solution. In fact, we are going to build two main classes. The first one will
hold all the functionality and data needed to open a file, create and open a
memory-mapped file, map and unmap a view of the file, and close all the handles
that were opened. The second class will deal with reading and writing data
through the mapped view of the file (the virtual memory address in the hosting
process) and with flushing the memory-mapped file.
The memory map file class
The MemoryMappedFile class holds all the functionality and data needed to
operate the MMF. The class implements the IDisposable interface, because it is
a good idea to let the user free all the handles held by the class whenever
required. As private data, we hold the handle of the MMF object, which we
receive from OpenFileMapping or CreateFileMapping. We need this handle to call
MapViewOfFile and to close the MMF object. We also keep the size of the map
view, to be used while marshalling data between managed and unmanaged code.
There are two ways to create a new instance of our class. The first is to
create the object and use its functionality to obtain the base address of the
map view later. The second is to create the object with all the needed
parameters, so that we hold the MMF handle when the constructor returns. The
logic behind the parameterized constructor is simple: we first try to get a
handle to an existing MMF by its unique name. If we do not receive a valid
handle, we create (or open) the file with the given file name, and then create
a file mapping on that file using the given parameters. When the constructor
finishes successfully, our class holds the handle to the MMF object. If someone
has already opened an MMF object with the given name, the constructor executes
quickly. If not, we need to do additional work (opening the file and mapping
it), which results in a longer execution time:
public MemoryMappedFile( String fileName, MapProtection protection,
    MapAccess access, long maxSize, String name)
{
    IntPtr hFile = IntPtr.Zero;
    try
    {
        // Look for an already open MMF object by its unique name. If the
        // returned handle is null, we need to create the MMF object.
        m_hMap = Win32MapApis.OpenFileMapping((int)access,false,name);
        if (m_hMap == NULL_HANDLE )
        {
            int desiredAccess = GENERIC_READ;
            if ( (protection == MapProtection.PageReadWrite) ||
                 (protection == MapProtection.PageWriteCopy) )
            {
                desiredAccess |= GENERIC_WRITE;
            }
            // First, try to open the backing file using the given parameters.
            // If we succeed, use the file handle to create the MMF object.
            // If either call fails, throw an exception; the Marshal class
            // gives us the last error from Win32.
            hFile = Win32MapApis.CreateFile (
                GetMMFDir() + fileName, desiredAccess, 0,
                IntPtr.Zero, OPEN_ALWAYS, 0, IntPtr.Zero);
            // CreateFile reports failure with INVALID_HANDLE_VALUE, not null.
            if (hFile != INVALID_HANDLE_VALUE)
            {
                m_hMap = Win32MapApis.CreateFileMapping (
                    hFile, IntPtr.Zero, (int)protection,
                    0,(int)(maxSize & 0xFFFFFFFF), name );
                if (m_hMap != NULL_HANDLE)
                    m_maxSize = maxSize;
                else
                    throw new FileMapIOException
                        ( Marshal.GetHRForLastWin32Error() );
            }
            else
                throw new FileMapIOException
                    ( Marshal.GetHRForLastWin32Error() );
        }
    }
    finally
    {
        // If the file handle is in use, we need to free it. No catch block
        // is needed: exceptions propagate with their stack trace intact.
        if ( (hFile != NULL_HANDLE) && (hFile != INVALID_HANDLE_VALUE) )
            Win32MapApis.CloseHandle(hFile);
    }
}
When creating the physical file, we use the GetMMFDir function, which returns
the drive on which to create the file. Eventually, we deploy applications to
integration, test, and production servers. These servers use separate drives to
keep the system files apart from the application files. On these servers, for
example, drive E is for application files and drive C holds just system files.
On the development machine, drive C holds both the application and the system
data. In the sample code, GetMMFDir simply returns the C drive, but you can
read the drive from the registry, an INI file, or a config file.
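As one possible way to make the location configurable, the sketch below reads
the directory from an environment variable and falls back to a fixed default.
The variable name MMF_DIR, the default path, and the MmfLocation class are all
assumptions made for this example; the original GetMMFDir is not shown in the
article:

```csharp
using System;
using System.IO;

public static class MmfLocation
{
    // Hypothetical names; neither appears in the original source.
    public const string DirVariable = "MMF_DIR";
    public const string DefaultDir = @"C:\MMFfiles\";

    public static string GetMMFDir()
    {
        string configured = Environment.GetEnvironmentVariable(DirVariable);
        string dir = string.IsNullOrEmpty(configured) ? DefaultDir : configured;

        // Callers concatenate the file name directly onto the result,
        // so make sure the directory ends with a separator.
        char last = dir[dir.Length - 1];
        bool endsWithSeparator = last == '\\'
            || last == Path.DirectorySeparatorChar
            || last == Path.AltDirectorySeparatorChar;
        return endsWithSeparator ? dir : dir + Path.DirectorySeparatorChar;
    }
}
```

With this shape, the same build can run on a development machine and on a
server that keeps application files on a different drive.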
The other approach is to create an empty MemoryMappedFile object and then use
the Create function to create a file and a memory-mapped file. The Create
function always creates a new file and a new memory-mapped file. It is included
to make the class more general, but it is not used in the code.
The Open function accomplishes two tasks. The first is to obtain a handle to an
already open MMF object by its name. The second is to indicate whether an MMF
with that name is already open:
public bool Open ( MapAccess access, String name )
{
bool RV = true;
try
{
m_hMap = Win32MapApis.OpenFileMapping ( (int)access, false, name );
if ( m_hMap == NULL_HANDLE )
RV=false;
return RV;
}
catch
{
// an exception means that the MMF could not be opened
return false;
}
}
In this solution, the function most often used to get the MMF object handle is
OpenEx. This function also uses OpenFileMapping to open an already created MMF
object. However, if OpenFileMapping returns an invalid handle, we try another
way to open the MMF object. Our solution is based on mapping files to memory,
which means that for every MMF object there is a physical file. As mentioned
previously, the operating system updates this file with the data that is
written to the memory. We can reuse the file, with the data in it, by opening
it and getting its handle. We can then use this handle as one of the parameters
for creating the MMF object. If we succeed, the file with the data is mapped to
memory and the function returns true. If we fail to open the MMF object, the
function returns false. The main difference between this function and the
previously mentioned functions is that this function attempts to open the MMF
object from an existing physical file, while the others try to open the map
file and, if this fails, create a new physical file:
public bool OpenEx (int size,string FileName, MapProtection protection,
string name,MapAccess access)
{
bool RV = false;
IntPtr hFile = INVALID_HANDLE_VALUE;
try
{
Attempt to open an already created MMF object. If none exists, check if a
backing file exists. If this is present, we obtain its size and the handle of
the file by using OpenFile. We then use the file handle to create a new MMF
object:
m_hMap = Win32MapApis.OpenFileMapping ( (int)access, true, name );
if ( m_hMap == NULL_HANDLE)
{
Check if a backing physical file exists on disk:
if ( System.IO.File.Exists (GetMMFDir() + FileName) )
{
long maxSize = size;
OFSTRUCT ipStruct = new OFSTRUCT ();
string MMFName = GetMMFDir() + FileName;
Open the physical file (a uStyle value of 2 is OF_READWRITE):
hFile = Win32MapApis.OpenFile (MMFName, ipStruct ,2);
// determine the file access needed, as in the constructor
// (OpenFile above takes its access mode from uStyle,
// so this value is not used in this function)
int desiredAccess = GENERIC_READ;
if ( (protection == MapProtection.PageReadWrite) ||
(protection == MapProtection.PageWriteCopy) )
{
desiredAccess |= GENERIC_WRITE;
}
Create a file-mapping object:
m_hMap = Win32MapApis.CreateFileMapping (
hFile, IntPtr.Zero, (int)protection,
(int)((maxSize >> 32) & 0xFFFFFFFF),
(int)(maxSize & 0xFFFFFFFF), name );
// report success only if the mapping was actually created
RV = (m_hMap != NULL_HANDLE);
}
else
RV = false;
}
else
RV = true;
return RV;
}
catch
{
return false;
}
Finally, close down the physical file handle:
finally
{
if ( (hFile != NULL_HANDLE) && (hFile != INVALID_HANDLE_VALUE) )
Win32MapApis.CloseHandle(hFile);
}
}
While writing the OpenEx function, we used another Win32 API function,
OpenFile, to open an existing file. To use this function in managed code, we
must add its P/Invoke declaration to our Win32MapApis class. We will use this
declaration to demonstrate how to declare API structures. The Win32 OpenFile
function needs to receive a pointer to a structure that holds return
information about the opened file:
HFILE OpenFile(
LPCSTR lpFileName, // file name
LPOFSTRUCT lpReOpenBuff, // file information
UINT uStyle // action and attributes
);
typedef struct _OFSTRUCT {
BYTE cBytes;
BYTE fFixedDisk;
WORD nErrCode;
WORD Reserved1;
WORD Reserved2;
CHAR szPathName[OFS_MAXPATHNAME];
} OFSTRUCT, *POFSTRUCT;
The structure raises a new concern: one of its members is a fixed-size array of
chars, which requires special attention. To use the structure in managed code,
we will create a new class that represents all the data in the Win32 structure:
[StructLayout (LayoutKind.Sequential )]
public class OFSTRUCT
{
public const int OFS_MAXPATHNAME = 128;
public byte cBytes;
public byte fFixedDisc;
public UInt16 nErrCode;
public UInt16 Reserved1;
public UInt16 Reserved2;
[MarshalAs (UnmanagedType.ByValTStr,SizeConst=OFS_MAXPATHNAME)]
public string szPathName;
}
In this class declaration, we use the StructLayout attribute with
LayoutKind.Sequential to tell the CLR to lay out the fields in memory in their
declaration order. The MarshalAs attribute on the string member szPathName
tells the CLR how to marshal the field to unmanaged memory. The ByValTStr
value of MarshalAs indicates a fixed-length string that is stored inline in the
structure, and SizeConst holds the constant that sets the string size. This way
we can marshal a fixed array of chars to unmanaged code.
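One way to check such a declaration is to ask the marshaler for the unmanaged
size and field offsets and compare them with the native structure. The sketch
below repeats the layout under the hypothetical name OfStructSketch, with the
character set spelled out explicitly; LayoutCheck is also an illustrative
helper, not part of the original source:

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
public class OfStructSketch
{
    public const int OFS_MAXPATHNAME = 128;
    public byte cBytes;
    public byte fFixedDisk;
    public ushort nErrCode;
    public ushort Reserved1;
    public ushort Reserved2;
    // Fixed-length ANSI string stored inline, one byte per character.
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = OFS_MAXPATHNAME)]
    public string szPathName;
}

public static class LayoutCheck
{
    // Size of the structure as the marshaler will lay it out in unmanaged memory.
    public static int UnmanagedSize()
    {
        return Marshal.SizeOf(typeof(OfStructSketch));
    }

    // Byte offset of the inline path buffer inside the unmanaged structure.
    public static int PathOffset()
    {
        return Marshal.OffsetOf(typeof(OfStructSketch), "szPathName").ToInt32();
    }
}
```

The native OFSTRUCT occupies 1 + 1 + 2 + 2 + 2 + 128 = 136 bytes with the path
buffer starting at offset 8, so a managed declaration that reports other values
does not match the Win32 layout.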
There are two new issues regarding P/Invoke in the OpenFile declaration.
Firstly, we set the CharSet field to Ansi. The API function expects the file
name as an ANSI string, but the CLR uses Unicode, so CharSet.Ansi makes the
marshaler convert the managed Unicode string to an unmanaged ANSI string.
Secondly, we use the MarshalAs attribute to tell the CLR to marshal the class
that mirrors the structure as a pointer to a structure:
[ DllImport("kernel32", SetLastError=true, CharSet=CharSet.Ansi ) ]
public static extern IntPtr OpenFile (String lpFileName,
[Out,MarshalAs (UnmanagedType.LPStruct )]
OFSTRUCT lpReOpenBuff,
int uStyle);
After getting the handle of the MMF object, the next step is to map a view of
the MMF object. The MapView function is responsible for this task. By calling
the MapViewOfFile function, we obtain the base address, in the hosting process,
at which the view of the MMF object begins. However, in managed code we cannot
read from or write to the unmanaged heap directly. To work around this problem,
we will create a new MemoryStream class that uses the Marshal class to read
from and write to unmanaged memory. If the MMF object contains data, we need to
copy this data from unmanaged memory to our new MemoryStream object. For this
task we use the Copy function of the Marshal class. This function copies a
specific number of bytes from the unmanaged heap, starting from a given
address, to a managed byte array. We will use this byte array in the stream
class, to be created in the next step:
public MapViewStream MapView ( MapAccess access, long offset, int size,
string path )
{
IntPtr baseAddress = IntPtr.Zero;
bool iSWritable=true;
MapProtection protection=MapProtection.PageReadOnly;
try
{
Use the Win32 function to obtain the base address of the mapped view in the
hosting process:
baseAddress = Win32MapApis.MapViewOfFile (
m_hMap, (int)access,
(int)((offset >> 32) & 0xFFFFFFFF),
(int)(offset & 0xFFFFFFFF), 0 );
if ( baseAddress != IntPtr.Zero )
{
if ( access == MapAccess.FileMapRead )
protection = MapProtection.PageReadOnly;
else
protection = MapProtection.PageReadWrite;
m_maxSize = size;
Copy the bytes from the unmanaged memory heap to the byte array:
byte[] bytes = new byte[m_maxSize];
Marshal.Copy(baseAddress,bytes,0,(int)m_maxSize);
if ( access == MapAccess.FileMapRead )
iSWritable = false;
else
iSWritable = true;
Return a new instance of our implementation of the MemoryStream by sending the
base address and the byte array:
return new MapViewStream(baseAddress, bytes,iSWritable);
}
return null;
}
catch
{
throw new FileMapIOException ( Marshal.GetHRForLastWin32Error() );
}
}
The MemoryMappedFile class also contains functions to close the MMF object by
using the API function CloseHandle and to dispose of the object (freeing the
MMF object handle). This class supplies all the functionality necessary to
create, open, and map a view of the MMF object. The MapView function returns an
instance of the MapViewStream class. This class handles all the functions
required to read and write data from unmanaged memory. We will go through this
class in the next section.
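For comparison, .NET Framework 4 and later ship a built-in wrapper for the same
Win32 facility in the System.IO.MemoryMappedFiles namespace. This is a
different technique from the hand-written P/Invoke class described in this
article, and its MemoryMappedFile type only shares the name with ours. The
sketch below, under the hypothetical name BuiltInMmfSketch, writes a
length-prefixed UTF-8 string into a file-backed mapping and reads it back:

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Text;

public static class BuiltInMmfSketch
{
    public static string RoundTrip(string path, string text)
    {
        byte[] payload = Encoding.UTF8.GetBytes(text);
        long capacity = sizeof(int) + payload.Length;

        // The map name is null here because named maps are a Windows-only
        // feature; pass a name to share the mapping between processes.
        using (var mmf = MemoryMappedFile.CreateFromFile(
                   path, FileMode.Create, null, capacity))
        using (var view = mmf.CreateViewAccessor(0, capacity))
        {
            // Store the payload length first, then the payload itself.
            view.Write(0, payload.Length);
            view.WriteArray(sizeof(int), payload, 0, payload.Length);
            view.Flush();

            int length = view.ReadInt32(0);
            byte[] buffer = new byte[length];
            view.ReadArray(sizeof(int), buffer, 0, length);
            return Encoding.UTF8.GetString(buffer);
        }
    }
}
```

Storing the length in front of the data avoids any dependence on a terminator
and tells the reader exactly how many bytes to copy.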
Reading and writing the data
This part of the task is the most important, complicated, and interesting. I'll
try to walk through it as clearly as I can. Up to this point, we have managed
to use the MMF Win32 API from C# to get the base address of the MMF mapping in
the process that hosts our DLL. From this point on, we need to move data
between the memory area that holds the mapped data and the managed code. This
is not an easy task because of the way managed code works. In managed code, the
CLR is responsible for handling the memory, which prevents us from reaching the
memory directly. The base address of the mapped MMF, however, demands direct
access to memory in order to read and write data.
There are two solutions to this problem in C#. The first and best known is
unsafe code. We can write blocks of unsafe code into our solution and
manipulate the memory directly from these blocks. The second option is to use
the Marshal class. This class provides a collection of methods for allocating
unmanaged memory, copying unmanaged memory blocks, and converting managed to
unmanaged types. We have already used the Copy function of the Marshal class in
the MapView function, to copy data from unmanaged memory to a managed byte
array.
At this point, let us assemble the puzzle pieces that we have so far. Using
MMF, we can share data between processes. MMF gives us the base address of the
memory area where the data is mapped in our process. To manipulate this data,
we will use the Marshal class. Just one piece remains unclear: how do we write
objects to the cache? We need to find a way to move the data from a managed
object to the unmanaged memory heap. At first sight, it looks as though
serialization is the easiest way to do it. We can serialize the private and
public members of a class into a MemoryStream object. The MemoryStream class
implements reading and writing of its data. We can overload the Read and Write
functions of the MemoryStream class so that reading and writing go to the
unmanaged heap, using the Marshal class.
I tried the above approach. I wrote my own stream class that inherits from the
abstract Stream class, and implemented the read and write functions by using
Marshal.Copy. I checked the performance of this class and found it
unsatisfactory. So, I tried another approach. Instead of using serialization, I
wrote the objects as they were to the unmanaged memory (I performed the test
with strings). There was a significant improvement in performance. The problem
with this approach is that most CLR objects do not tell us their length in
memory, nor does the CLR supply a function that returns this information. The
length of the object is required by the Copy method. We will take this
limitation into account while building our MemoryStream class.
Before writing our MemoryStream class, let us examine some of the options that
the Marshal class supplies:
-
ReadByte reads one byte at a time from unmanaged memory. To read the byte, we
supply the memory address from which to read and, optionally, an offset in
bytes from that address. We can use this function to read byte by byte from the
unmanaged heap into a managed array.
-
ReadInt16 (ReadInt32, ReadInt64) reads a 16-bit (32-bit, 64-bit) integer from
unmanaged memory. To read the integer, we supply the memory address from which
to read and, optionally, an offset in bytes from that address. We can use this
function to read integers from the unmanaged heap.
-
WriteByte writes one byte at a time to unmanaged memory. To write the byte, we
supply the memory address at which to write and, optionally, an offset in bytes
from that address. We can use this function to write byte by byte from a stream
to the unmanaged heap.
-
WriteInt16 (WriteInt32, WriteInt64) writes a 16-bit (32-bit, 64-bit) integer to
unmanaged memory. To write the integer, we supply the memory address at which
to write and, optionally, an offset in bytes from that address. We can use this
function to write integers to the unmanaged heap.
-
StringToHGlobalUni (StringToHGlobalAnsi, StringToHGlobalAuto) copies a string
(in Unicode or ANSI format) from the managed to the unmanaged heap. This
function allocates the memory in the heap itself, copies the string, and
returns the address. Because we cannot choose the destination address, we
cannot make it copy the string to the base address of the MMF view, so we
cannot use this function.
-
PtrToStringUni (PtrToStringAnsi, PtrToStringAuto) copies a string (in Unicode
or ANSI format) from a given address to a managed string. We can use this
function to copy strings from the unmanaged heap.
-
Copy is the most practical method. It can copy arrays of byte, char, double,
short, int, and long between the managed and the unmanaged heap. To perform the
copy, we supply the unmanaged memory address, the managed array, the index of
the array element at which the operation starts, and the number of array
elements to copy. We can use this function to copy any data between managed and
unmanaged memory, as long as we know the size of the copied objects.
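The functions listed above can be tried without a mapped view by allocating a
small block of unmanaged memory. The following minimal sketch does that; the
MarshalTour class name is hypothetical, and AllocHGlobal stands in for the base
address that MapViewOfFile would return:

```csharp
using System;
using System.Runtime.InteropServices;

public static class MarshalTour
{
    // Returns { int read back, byte read back, first and last copied bytes }.
    public static int[] Run()
    {
        IntPtr block = Marshal.AllocHGlobal(16);
        try
        {
            // Write an integer at offset 0 and a single byte at offset 4.
            Marshal.WriteInt32(block, 0, 1234);
            Marshal.WriteByte(block, 4, 0x7F);

            // Copy a managed array to unmanaged memory at offset 8.
            byte[] source = { 10, 20, 30, 40 };
            Marshal.Copy(source, 0, IntPtr.Add(block, 8), source.Length);

            // Read everything back into managed values.
            int number = Marshal.ReadInt32(block, 0);
            byte single = Marshal.ReadByte(block, 4);
            byte[] target = new byte[source.Length];
            Marshal.Copy(IntPtr.Add(block, 8), target, 0, target.Length);

            return new int[] { number, single, target[0], target[3] };
        }
        finally
        {
            // Unmanaged memory is not garbage collected; free it explicitly.
            Marshal.FreeHGlobal(block);
        }
    }
}
```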
Now that we are familiar with the Marshal functions and the limitation of
serialization, we will create a MemoryStream object that supplies the needed
functionality both by using serialization and by writing data directly to the
unmanaged heap without serialization. As mentioned earlier, in most cases we do
not know the size of the object. In this situation we will use serialization to
get a stream with a known size (this is part of the built-in functionality of
the formatter). If we know the object size, which is the case for strings,
integer types, and arrays (except arrays of strings), we write the data to the
unmanaged heap without using serialization. The new class will be named
MapViewStream.
Most of the functionality that exists in the CLR MemoryStream meets our needs,
so by inheriting from it we save ourselves some code. As private data we keep
the base address returned from the MapView function. We will use this address
in almost every function that we write:
public class MapViewStream : MemoryStream //, IDisposable
{
private IntPtr m_baseaddress = IntPtr.Zero;
This object is created when the user calls the MapView function, which returns
a MemoryStream that can transfer data to and from unmanaged memory. To work
with such a stream, we need the base address returned by the MapViewOfFile
function (called inside MapView) and a byte array that holds the MemoryStream
data. We obtain the byte array by copying the data that exists in the mapped
area from unmanaged memory. The last parameter indicates whether the stream is
writable. Inside the constructor we call the base class constructor with the
byte array and the writable flag. The base address is stored in the private
member of the class:
public MapViewStream(IntPtr baseaddress, byte[] bytes,bool iSWritable) :
base(bytes,iSWritable)
{
m_baseaddress = baseaddress;
}
After storing the data and creating the base class, we do not need to overload
the regular Read function. We already have the byte array in the stream, so we
do not need to do anything special to read it. The regular Read is used when
de-serializing a type. For writing the serialized bytes, we need to create a
new Write function. This function receives the byte array of the stream and the
stream length as parameters. Inside this function we use the Marshal.Copy
function to write the byte array to the unmanaged heap:
public void Write ( byte[] buffer, int count )
{
try
{
Marshal.Copy(buffer,0,m_baseaddress,count);
}
catch (Exception)
{
// rethrow as-is; "throw Err" would reset the stack trace
throw;
}
}
We will add a Write function that receives a string as a parameter. This
function writes the string to the unmanaged heap. To copy the string, we pass
Marshal.Copy an array of chars. We are able to write a string without using
serialization because a string exposes its length. Note that Length counts
UTF-16 characters, so the copy writes two bytes per character:
public void Write (string str)
{
try
{
Marshal.Copy(str.ToCharArray (),0,m_baseaddress,str.Length);
}
catch (Exception)
{
// rethrow as-is; "throw Err" would reset the stack trace
throw;
}
}
The Read function without any parameters is added to read a string from the
unmanaged heap into a managed string. PtrToStringUni expects the string to end
with a null terminator and uses it to find the end of the string. We have to
supply PtrToStringUni with the address at which the string starts. This is the
base address that we used in the Write function (the map view base address).
Starting from this address, the function collects characters until it reaches
the null terminator and returns them as a string. Note that Write(string)
copies only the characters themselves, so the terminator must already be
present in the view after the last character:
public string Read ()
{
try
{
return Marshal.PtrToStringUni(m_baseaddress);
}
catch (Exception)
{
// rethrow as-is; "throw Err" would reset the stack trace
throw;
}
}
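The string Write and Read pair can be exercised outside a mapped view. The
sketch below is a variation under stated assumptions rather than the article's
code: it uses a block from AllocHGlobal in place of the view base address,
sizes the block in bytes rather than characters, and writes the null terminator
explicitly. The StringCopySketch name is hypothetical:

```csharp
using System;
using System.Runtime.InteropServices;

public static class StringCopySketch
{
    public static string RoundTrip(string text)
    {
        char[] chars = text.ToCharArray();

        // Two bytes per UTF-16 character, plus two bytes for the terminator.
        int bytesNeeded = (chars.Length + 1) * sizeof(char);
        IntPtr block = Marshal.AllocHGlobal(bytesNeeded);
        try
        {
            // The last argument is a count of array elements, not of bytes.
            Marshal.Copy(chars, 0, block, chars.Length);

            // Write the terminator that PtrToStringUni looks for.
            Marshal.WriteInt16(block, chars.Length * sizeof(char), (short)0);

            return Marshal.PtrToStringUni(block);
        }
        finally
        {
            Marshal.FreeHGlobal(block);
        }
    }
}
```

Writing the terminator explicitly also covers the case where a shorter string
replaces a longer one at the same address.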
The Flush function is used to update the underlying medium with the changes
made in the stream. In our case, we call FlushViewOfFile with the base address
as a parameter to write the changes we made in the mapped memory to the file
we mapped:
public override void Flush()
{
base.Flush();
Win32MapApis.FlushViewOfFile(m_baseaddress,0);
}
When we close our MemoryStream, we also want to ensure that the change will be
reflected. We achieve this by calling the Flush function. After the
MemoryStream is closed, we cannot read and write to the view of the map file,
therefore it is better to un-map the view of the MMF object:
public override void Close()
{
Flush();
base.Close();
Win32MapApis.UnmapViewOfFile(m_baseaddress);
}
}
In this section we built a class that inherits from MemoryStream, reads from
and writes to unmanaged memory, and allows us to manipulate the data. This
class is the key to this solution. We use it to read and write MMF views on the
one side, and to serialize or de-serialize objects, or read and write objects
directly, on the other. In the next section we will examine how we implement
the logic of this solution.
Caching the data
Our goal here is to share data between processes. Because of the lack of
object size information in the CLR, most of the objects that we cache will use
serialization. Basically, we could serialize the object into a file and
synchronize the access to the file. This way, every process could grab the
object by using de-serialization. The problem with this solution is that it is
slow, because of the need to perform an IO operation every time we want to
de-serialize an object from the file, and because of the thread synchronization
(to access the resource). Using MMF we can be more efficient. We need to create
the MMF object only once. Every process then maps the MMF object to a space in
its own memory. This way we can communicate between processes by using the name
of the MMF object, and we improve the speed, since reading a value is merely
reading data from memory instead of from a file.
In most cases, we will serialize an object into a file (or de-serialize it
from a file), and by using MMF we will access the file as we access memory. The
good news is that the serialization functionality is largely built into the
CLR, so we are able to just use it. There is just one exception. If we want to
serialize our own types, we need to implement ISerializable. The implementation
forces us to create a special constructor that reads the public and private
data (de-serialize) and to implement the GetObjectData function that writes the
data (serialize). Another way to accomplish this, which is perhaps simpler, is
to use the Serializable attribute. This attribute indicates that the class can
be serialized; every member of the type (private or public) will then be
serialized by the CLR. The difference between these two approaches is that the
first one is more flexible and gives us more control over the serialization
process. (It can be used, for example, to serialize the data into a long string
that can be written directly to memory.) Bear in mind that anyone who wants to
use your machinery to store their own types will need to take care of
serialization. We will see how to use serialization shortly.
In this solution we maintain a cache, which means that we need to keep many
objects in our machinery. We need to keep track of the cached objects so that,
when we need to get a certain object from the cache, we know "who" the object
is, what its length is, and where it is stored. The simple way to know "who"
the object is, is to give it a name, which can be followed by the object
length. This way, every process can obtain the same object from the cache by
using its name. The hash table looks like the most suitable object for storing
and retrieving this data quickly, but using it would cause performance
problems. The hash table would be used and changed every time the machinery is
called (to add, get, change, or remove an object). Every time we changed the
hash table, we would need to use serialization, because we cannot know its
length. You may remember that serialization is more time consuming than
handling the memory directly. Therefore, we will keep the management data of
the cached objects in strings, which we can read and write directly in
unmanaged memory.
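The AddObject code shown later appends one entry per object to this string in
the form name#size@. As a minimal sketch of how such a string can be built and
parsed, assuming that object names never contain the characters # or @, the
hypothetical CacheIndex helper below illustrates the idea; it is not part of
the original source:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CacheIndex
{
    // Appends one "name#size@" entry to the management string.
    public static string AddEntry(string index, string name, int size)
    {
        return index + name + "#"
            + size.ToString(CultureInfo.InvariantCulture) + "@";
    }

    // Splits the management string back into name/size pairs.
    public static Dictionary<string, int> Parse(string index)
    {
        var entries = new Dictionary<string, int>();
        string[] parts = index.Split(
            new[] { '@' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string part in parts)
        {
            int separator = part.LastIndexOf('#');
            if (separator < 0)
                continue; // skip anything that is not a name#size pair
            string name = part.Substring(0, separator);
            int size = int.Parse(
                part.Substring(separator + 1), CultureInfo.InvariantCulture);
            entries[name] = size;
        }
        return entries;
    }
}
```

Because the whole index is a single string, it can be written to and read from
the mapped memory with the same string functions as any cached string.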
Another issue that we need to consider is where to keep the data that we cache.
Basically, there are two approaches. The first is to keep all the data in the
same MMF object (file). This approach requires us to know the current size of
the cache management string so that we can extract it, and to know where the
requested object is located and what its length is. In this case we would save
the address offset of each cached object in the memory. Besides the cached
object offsets, we would need to keep the length of the cache management string
in the first 4 bytes. With this approach, we would have just one file to map.
With the other approach, we have a file for every object. We serialize or write
objects directly to memory and reflect them in the file. In this way, we just
need to keep the name of the file mapped for every object. The advantage here
is that we do not need to keep the size of the cache management string and the
offset of every object, as every object is allocated its own file to hold its
data. Furthermore, this approach keeps the data of each object in its own file,
which is preserved across a computer shutdown. With separate files, it is also
much easier to maintain data that rapidly changes its size. The second
alternative has been selected, mainly for the simplicity with which it stores
rapidly changing objects. To simplify further, we name every backing file after
the name under which the object is stored, so we can find the object by its
name.
To keep all the backing files that we create in the same place, we set aside a
special folder, MMFfiles, for those files. Every file in this folder has the
name the user gave to the object plus a .nat extension. The cache management
string is always stored under the name ObjectNamesMMF.
The intention of the DevCache class is to encapsulate the logic of the cache
machinery and to give the end user a simple and intuitive interface to work
with. We allow the user to cache any object, but if we discover that the object
to be cached is a string, we cache it without using serialization, which
improves performance. The DevCache interface consists of three functions:
-
AddObject is responsible for adding new objects to the cache or for updating
the content of existing objects. The function adds or updates entries in the
cache management string, and creates or updates the file and the MMF object.
-
GetObject looks for the object name in the cache management string,
de-serializes the object or reads it directly from the MMF object, and returns
it.
-
RemoveObject removes an object by removing its name from the cache management
string, closing its MMF object, and deleting its file.
Besides the public functions, there are some private functions that are
responsible for special tasks in the overall process. We will walk through
these while working on the class.
DevCache holds four private members. m_StringMMFs holds the names of all the
objects that were added without serialization. We need this list to know how to
get them from memory when the user asks for them. oMutex is an instance of a
named Mutex. We use the Mutex to synchronize access between threads from
different processes that try to access the shared memory at the same time.
oStringMMF is the MMF object that holds the list of un-serialized objects, so
that we can preserve this data across processes and across a computer shutdown.
ObjectNamesMMF is a const that holds the name of the MMF object for the cache
management data:
public class DevCache
{
private string m_StringMMFs = "";
private System.Threading.Mutex oMutex =
new System.Threading.Mutex(false,"MmfUpdater");
MemoryMappedFile oStringMMF = new MemoryMappedFile();
private const string ObjectNamesMMF = "ObjectNamesMMF";
Writing to MMF
In this section we will cover the private functions that actually write the
cached objects into the MMF memory. We support writing objects both with and
without serialization. WriteString2MMF writes strings to the MMF memory. As we
will see, after checking the type of the object to be cached, we call this
function if we find that the object is a string. This function is also always
used to write the cache management string:
private int WriteString2MMF(string InObject, string obectName)
{
MemoryMappedFile map = new MemoryMappedFile();
Set the size variable to the string length. We will use it to open the MMF
object and to map a view of it:
int iSize = InObject.Length;
oMutex.WaitOne ();
We use OpenEx to discover whether there is an open MMF object, or a file
holding the MMF data that can be used to open the MMF object. If we fail to
open the MMF object, we create a new one:
if (!map.OpenEx (iSize,obectName + ".nat",MapProtection.PageReadWrite,
obectName,MapAccess.FileMapAllAccess))
map = new MemoryMappedFile (obectName + ".nat",
MapProtection.PageReadWrite,
MapAccess.FileMapAllAccess,iSize,obectName);
Calling MapView to get the MapViewStream object:
MapViewStream stream = map.MapView(MapAccess.FileMapAllAccess, 0,
(int)iSize,obectName + ".nat" );
Passing a string to the MapViewStream.Write to write the string to the memory:
stream.Write(InObject);
stream.Close();
oMutex.ReleaseMutex();
return iSize;
}
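In WriteString2MMF above, an exception thrown between WaitOne and ReleaseMutex
would leave the named mutex held by the calling thread. A common way to guard
against that is to release the mutex in a finally block. The following is a
minimal sketch of that pattern; the CacheLock class and the WithLock method are
hypothetical helpers and not part of the original source:

```csharp
using System;
using System.Threading;

public static class CacheLock
{
    // Runs the action while holding the named mutex and always releases it.
    public static T WithLock<T>(string mutexName, Func<T> action)
    {
        using (var mutex = new Mutex(false, mutexName))
        {
            bool held = false;
            try
            {
                try
                {
                    held = mutex.WaitOne();
                }
                catch (AbandonedMutexException)
                {
                    // A previous owner exited without releasing the mutex;
                    // the wait still succeeded and we own it now.
                    held = true;
                }
                return action();
            }
            finally
            {
                // Release even when the action throws.
                if (held)
                    mutex.ReleaseMutex();
            }
        }
    }
}
```

A write function could then wrap its open, map, and write steps in a single
call such as CacheLock.WithLock("MmfUpdater", () => ...), so that other
processes are never blocked by a failed write.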
WriteObjectToMMF receives the object to cache, its name, and its size. With
these parameters, the function writes the object to the MMF memory with the
assistance of serialization. Before using serialization, the function checks
the type of the InObject parameter. If InObject is a string, we call the
WriteString2MMF function in order to improve performance:
private int WriteObjectToMMF(object InObject, string obectName,int ObjectSize)
{
Check the InObject type to see if it’s a string:
if (InObject.GetType() == typeof(String) )
{
Add the objectName to the MMF that holds all the objects added to cache, without
using serialization. Then write the string to the MMF memory area:
this.StringMMFs = obectName;
return WriteString2MMF(InObject.ToString(), obectName);
}
MemoryMappedFile map = new MemoryMappedFile();
MemoryStream ms = new MemoryStream ();
BinaryFormatter bf= new BinaryFormatter();
int iSize = 0;
Use the binary formatter to serialize the object to a stream and obtain its size:
bf.Serialize (ms,InObject);
iSize = (int)ms.GetBuffer().Length;
oMutex.WaitOne ();
Open MMF object with the object name:
if (!map.OpenEx (iSize,obectName + ".nat",MapProtection.PageReadWrite,
obectName,MapAccess.FileMapAllAccess))
map = new MemoryMappedFile (obectName + ".nat",
MapProtection.PageReadWrite,MapAccess.FileMapAllAccess,
iSize,obectName);
Get a MapViewStream object from the view of the MMF and pass it the stream's
byte array to be written to the memory:
MapViewStream stream = map.MapView(MapAccess.FileMapAllAccess, 0,
(int)iSize,obectName + ".nat" );
stream.Write(ms.GetBuffer(),iSize);
Update the MMF object and un-map the view of MMF object:
stream.Close();
oMutex.ReleaseMutex();
return iSize;
}
The StringMMFs property gets and sets the list of names of the objects stored
in the cache without serialization. We need this list when retrieving an object
from the cache: if we tried to de-serialize an object that was stored without
serialization, we would get an error. To retrieve the string from the MMF
object, we try to open the MMF. If we fail, an empty string is returned. If we
succeed, we use MapViewStream.Read to get the string from the MMF memory:
if (!oStringMMF.OpenEx(4,"stringMmf.nat", MapProtection.PageReadWrite,
"StringMmf",MapAccess.FileMapAllAccess))
return "";
MapViewStream stream = oStringMMF.MapView(MapAccess.FileMapAllAccess,0,4,
"stringMmf.nat");
string str = stream.Read();
stream.Close();
return str;
Setting a value in the string is more complicated. While setting the string, we
want to add new object names if they do not exist in the string, and we want to
remove an object name from the list when the user replaces a string object with
an object of another type. If we cannot open the MMF object, we create a new
one, using the length of the existing m_StringMMFs string (after adding the new
value) to set its size and the map view size:
if(m_StringMMFs.IndexOf (value) == -1 || value == "")
{
m_StringMMFs += value;
if (!oStringMMF.OpenEx(m_StringMMFs.Length ,"stringMmf.nat",
MapProtection.PageReadWrite,"StringMmf",
MapAccess.FileMapAllAccess))
oStringMMF = new MemoryMappedFile( "stringMmf.nat",
MapProtection.PageReadWrite,
MapAccess.FileMapAllAccess,m_StringMMFs.Length,
"StringMmf");
MapViewStream stream = oStringMMF.MapView(MapAccess.FileMapAllAccess,
0,m_StringMMFs.Length, "stringMmf.nat");
If the last object name in the string is removed we need to clear the last
object name:
if (m_StringMMFs == "")
stream.Write(" ");
else
stream.Write(m_StringMMFs + "*");
stream.Close();
}
AddObject
Adding an object is a complicated task. There are several scenarios in which
this function is used:
-
The first time, when we create both the cache management string and the given
object.
-
A cache management string already exists and we want to add a new object. In
this situation, we need to add a new entry to the cache management string and
create a file and an MMF object for the given object.
-
The requested object already exists in the cache management string, but the
user asks to replace the content of the cached object's MMF with new data.
To check whether this is the first time the mechanism is accessed, we use the
OpenEx function of the MemoryMappedFile class. This method tries to open the
MMF object by the requested object name. If the open fails, the function looks
for a physical file that holds the data of the requested MMF object. If the
function finds the physical file, it creates a new MMF object based on the
physical file data. If the physical file is not found, the function returns
false and we know that this is the first time that this object has been added.
When the requested object name is the name of the cache management string, we
know that this is the first time that a request has reached the machinery.
When reaching the mechanism for the first time, we use the
WriteObjectToMMF/WriteString2MMF functions to create the physical files and the
MMF objects, and to load them with the cache object data. First we handle the
requested object; then, using its size and name, we handle the cache objects
manage data string.
When the cache objects manage data string MMF exists, we need to read the
string from memory and check whether the requested object name exists in it. If
the object name does not exist, we use the WriteObjectToMMF/WriteString2MMF
functions to create a physical file and an MMF object for the given cache
object, then load them with the cache object data. Following this, we add the
object name and size to the cache objects manage data string and write the
changed string back to its MMF object.
If the requested cache object name exists in the cache objects manage data
string, all we need to do is update the MMF object of the given object with the
new value:
public void AddObject(string objName, object inObject, bool UpdateDomain)
{
MemoryMappedFile map = new MemoryMappedFile();
Create a string builder that holds the cache objects manage data string:
System.Text.StringBuilder oFilesMap= new System.Text.StringBuilder();
int iSize = 0;
oMutex.WaitOne ();
try
{
Check if an MMF exists for the cache objects manage data string:
if (! map.OpenEx(0,ObjectNamesMMF + ".nat",MapProtection.PageReadWrite ,
ObjectNamesMMF,MapAccess.FileMapAllAccess))
{
If it does not exist, create the MMF for the requested cache object and load it
with the object data. Add the new cache object and its size to the cache
objects manage data string. Then create an MMF for the cache objects manage
data string and write the string content:
//Create MMF for the object and serialize it
iSize = WriteObjectToMMF(inObject,objName,0);
//add object name and mmf name to hash
oFilesMap.Append(objName + "#" + System.Convert.ToString(iSize) +
"@");
//create main MMF
WriteString2MMF(oFilesMap.ToString(),ObjectNamesMMF);
}
else
{
BinaryFormatter bf = new BinaryFormatter();
If the cache objects manage data string MMF exists, read its content:
MapViewStream mmfStream = map.MapView(MapAccess.FileMapAllAccess, 0,
0,ObjectNamesMMF + ".nat");
mmfStream.Position = 0;
oFilesMap.Append (mmfStream.Read());
long StartPosition = mmfStream.Position;
mmfStream.Close ();
Check if the cache objects manage data string contains the request cache object
name:
if (oFilesMap.ToString().IndexOf(objName + "#") > -1 )
{
If the requested cache object exists, we need to change its content. While
doing this, we need to check whether the new data is a string or an object and
act accordingly:
MemoryMappedFile MemberMap = new MemoryMappedFile();
bf = new BinaryFormatter();
MemoryStream ms = new MemoryStream ();
Check if the requested cache object's data type is string. If not, we need to
serialize the object to find out its size:
if (inObject.GetType() != typeof(String))
bf.Serialize (ms,inObject);
iSize = (int)ms.Length; // bytes actually written, not the buffer capacity
Open the existing request cache object MMF object:
MemberMap.OpenEx(iSize,objName + ".nat",
MapProtection.PageReadWrite,
objName,MapAccess.FileMapAllAccess);
MapViewStream stream = MemberMap.MapView
(MapAccess.FileMapAllAccess, 0,iSize,objName + ".nat");
stream.Position = 0;
Check the type again. If it is not a string, we write the serialized byte array
and remove the requested cache object from the string that lists the
non-serialized objects. If the type is string, we simply write the string and
add the requested cache object to the non-serialized objects string:
if (inObject.GetType() != typeof(String))
{
stream.Write(ms.GetBuffer(),iSize);
m_StringMMFs = m_StringMMFs.Replace (objName,"");
StringMMFs = "";
}
else
{
stream.Write (inObject.ToString());
iSize = inObject.ToString().Length;
StringMMFs = objName;
}
stream.Close();
Update the size of the object in the cache objects manage data string:
string[] str = oFilesMap.ToString().Split('@');
for(int i = 0; i < str.Length; i++)
{
if (str[i].IndexOf (objName) > -1)
{
string strVal = str[i].Substring( str[i].IndexOf('#')+1);
oFilesMap.Replace(str[i],objName + "#" + iSize);
break;
}
}
WriteString2MMF(oFilesMap.ToString() ,ObjectNamesMMF);
}
else
{
If the requested cache object name does not exist in the cache objects manage
data string, we create a file and an MMF for the new object and load them with
the new object data:
iSize = WriteObjectToMMF(inObject,objName,0);
Then we update the cache object manage data string and its MMF object:
MapViewStream stream = map.MapView (MapAccess.FileMapAllAccess,
0,0,ObjectNamesMMF + ".nat" );
// update the main HashTable
oFilesMap.Append(objName + "#" + System.Convert.ToString(iSize)
+ "@");
// serialize new Hash
stream.Write (oFilesMap.ToString());
stream.Position = 0;
stream.Close();
}
}
}
catch (Exception e)
{
throw new Exception("Cannot Open File "+objName,e);
}
finally
{
oMutex.ReleaseMutex ();
}
}
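The bookkeeping on the cache objects manage data string can be isolated from the MMF calls. The following is a minimal sketch, not part of the article's source: the class CacheIndex and its methods Parse and Upsert are hypothetical names, and the only thing taken from the code above is the "name#size@" entry layout. It compares whole names rather than using IndexOf, on the assumption that one object name may be a prefix of another (for example "word" and "words").

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: treats the manage data string as a sequence of
// "name#size@" entries, the layout that AddObject builds.
public static class CacheIndex
{
    // Split "a#10@b#20@" into ordered (name, size) pairs.
    public static List<KeyValuePair<string, int>> Parse(string index)
    {
        var entries = new List<KeyValuePair<string, int>>();
        string[] parts = index.Split(new[] { '@' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string part in parts)
        {
            int hash = part.LastIndexOf('#');
            if (hash <= 0) continue; // skip entries without a name or separator
            string name = part.Substring(0, hash);
            int size = int.Parse(part.Substring(hash + 1));
            entries.Add(new KeyValuePair<string, int>(name, size));
        }
        return entries;
    }

    // Add a new entry, or replace the size of the entry whose name matches exactly.
    public static string Upsert(string index, string name, int size)
    {
        List<KeyValuePair<string, int>> entries = Parse(index);
        bool found = false;
        for (int i = 0; i < entries.Count; i++)
        {
            if (entries[i].Key == name)
            {
                entries[i] = new KeyValuePair<string, int>(name, size);
                found = true;
                break;
            }
        }
        if (!found) entries.Add(new KeyValuePair<string, int>(name, size));

        var sb = new StringBuilder();
        foreach (KeyValuePair<string, int> e in entries)
            sb.Append(e.Key).Append('#').Append(e.Value).Append('@');
        return sb.ToString();
    }
}
```

With a helper of this kind, the "object exists" and "object is new" branches would both reduce to one Upsert call followed by writing the returned string back to the MMF.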
GetObject
Getting an object from the cache is fairly simple. We look up the object name
in the cache objects manage data string. If the object name exists, we open the
MMF object that represents the object, de-serialize or read the object from the
MMF, and return it to the caller:
public object GetObject(string objName)
{
MemoryMappedFile map = new MemoryMappedFile();
MemoryMappedFile mapOfName = new MemoryMappedFile();
string oFilesMap = "";
try
{
oMutex.WaitOne ();
Check if the cache objects manage data string exists. If not, the exception
thrown here is caught below and null is returned:
if (! map.OpenEx (0,ObjectNamesMMF + ".NAT",MapProtection.PageReadWrite
,ObjectNamesMMF,MapAccess.FileMapAllAccess ))
throw new Exception ("No Desc FileFound");
Get the string from the MMF object:
BinaryFormatter bf = new BinaryFormatter();
MapViewStream mmfStream = map.MapView (MapAccess.FileMapAllAccess,
0, 0,ObjectNamesMMF + ".NAT");
mmfStream.Position = 0;
oFilesMap = mmfStream.Read ();
long StartPosition = mmfStream.Position;
Check if the requested name exists. If not, null is returned in the same way:
if (oFilesMap.IndexOf(objName + "#") == -1)
throw new Exception ("No Name Found");
string strValSize = "";
Get the size of the requested object from the cache objects manage data string:
string[] str = oFilesMap.Split('@');
for(int i = 0; i < str.Length; i++)
{
if (str[i].IndexOf (objName) > -1)
{
strValSize = str[i].Substring( str[i].IndexOf('#')+1);
break;
}
}
Open the request object MMF object:
if(! mapOfName.OpenEx ( Convert.ToInt32(strValSize), objName +
".NAT",MapProtection.PageReadWrite ,
objName,MapAccess.FileMapAllAccess ))
throw new Exception ("No Name File Found");
mmfStream.Close();
mmfStream = null;
MapViewStream ObjStream = mapOfName.MapView(MapAccess.FileMapAllAccess,
0, Convert.ToInt32(strValSize) ,objName+".NAT");
ObjStream.Position = 0;
object oRV;
If the requested object name appears in the non-serialized objects string, read
the data directly as a string. Otherwise, de-serialize the object:
if (this.StringMMFs.IndexOf(objName) > -1 )
oRV = ObjStream.Read();
else
oRV = bf.Deserialize(ObjStream) as object;
ObjStream.Close ();
return oRV;
}
catch
{
return null;
}
finally
{
oMutex.ReleaseMutex ();
}
}
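The read path above relies on the size stored in the manage data string to map a view of the right length. The sketch below illustrates that idea with the System.IO.MemoryMappedFiles classes that ship with later versions of .NET, not with the article's own MemoryMappedFile and MapViewStream wrappers. The class MmfStringDemo, its method names, and the choice of UTF-8 are assumptions made for the illustration.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Text;

// Illustration only: uses the built-in .NET memory-mapped file API,
// which is a different API from the wrapper classes in this article.
public static class MmfStringDemo
{
    // Write a string into a file-backed map and return its size in bytes.
    // The caller is expected to record that size (the article keeps it in
    // the manage data string).
    public static int Write(string path, string text)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        if (bytes.Length == 0)
            throw new ArgumentException("A map cannot be created with zero capacity.", "text");

        using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Create, null, bytes.Length))
        using (var view = mmf.CreateViewStream(0, bytes.Length))
        {
            view.Write(bytes, 0, bytes.Length);
        }
        return bytes.Length;
    }

    // Read back exactly 'size' bytes, the value recorded at write time.
    public static string Read(string path, int size)
    {
        using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open, null, 0))
        using (var view = mmf.CreateViewStream(0, size, MemoryMappedFileAccess.Read))
        {
            byte[] buffer = new byte[size];
            int read = 0;
            while (read < size)
            {
                int n = view.Read(buffer, read, size - read);
                if (n == 0) break; // end of view reached early
                read += n;
            }
            return Encoding.UTF8.GetString(buffer, 0, read);
        }
    }
}
```

Mapping a view of the recorded size matters because the operating system may round a view up to a page boundary, so reading "to the end" could return trailing bytes that were never written.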
RemoveObject
To remove the object, we first need to read the cache objects manage data
string, remove the entry of the object name from the string, and write the
updated string to the MMF object. Then we use the Open method of the
MemoryMappedFile object to check whether an MMF object of the requested object
is already open. If so, we close the MMF object. All that remains is to delete
the physical file that holds the object data:
public void RemoveObject(string ObjName)
{
If we succeed in opening the cached objects manage data MMF:
MemoryMappedFile map = new MemoryMappedFile();
if ( map.OpenEx(0,ObjectNamesMMF + ".nat",MapProtection.PageReadWrite,
ObjectNamesMMF,MapAccess.FileMapAllAccess))
{
Remove the name of the object from the cached objects manage data string and
from the non-serialized objects string:
BinaryFormatter bf = new BinaryFormatter();
MapViewStream mmfStream = map.MapView(MapAccess.FileMapAllAccess, 0,
0,"");
mmfStream.Position = 0;
string oFilesMap = mmfStream.Read();
int iEntryStart = oFilesMap.IndexOf(ObjName);
string Entry = oFilesMap.Substring(iEntryStart,
oFilesMap.IndexOf("@",iEntryStart)+1 - iEntryStart);
oFilesMap = oFilesMap.Replace(Entry,"");
mmfStream.Write(oFilesMap);
mmfStream.Flush();
mmfStream.Close();
Delete the map of the object:
MemoryMappedFile oMMf = new MemoryMappedFile ();
if( oMMf.Open(MapAccess.FileMapAllAccess,ObjName))
{
oMMf.Close();
oMMf.Dispose();
}
if (System.IO.File.Exists(map.GetMMFDir() + ObjName + ".nat"))
System.IO.File.Delete(map.GetMMFDir() + ObjName + ".nat");
}
}
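The entry removal step can be written so that it tolerates a name that is not in the manage data string, a case in which Substring would be called with a negative start index. The following is a hedged sketch under the same "name#size@" layout; CacheIndexRemoval and RemoveEntry are hypothetical names, not part of the article's source.

```csharp
using System;

// Hypothetical helper: removes one "name#size@" entry from the manage data
// string, matching the whole name at the start of an entry.
public static class CacheIndexRemoval
{
    // Returns the index without the entry for 'name', or the index
    // unchanged when no such entry exists.
    public static string RemoveEntry(string index, string name)
    {
        string key = name + "#";
        int start = 0;
        while (start < index.Length)
        {
            int end = index.IndexOf('@', start);
            if (end < 0) break; // trailing text without a terminator: stop scanning

            string entry = index.Substring(start, end - start);
            if (entry.StartsWith(key, StringComparison.Ordinal))
                return index.Remove(start, end - start + 1);

            start = end + 1; // move to the beginning of the next entry
        }
        return index;
    }
}
```

Because the scan always starts at an entry boundary, removing "word" leaves an entry named "words" or "keyword" untouched.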
Update and lock machinery
In this section we deal with two closely connected issues. First, we could
build a mechanism that updates other machines in the domain when we add or
update an object in the cache. That issue deserves an article in its own right,
so I will move on to the second issue: synchronizing the different threads and
processes that may try to change the same memory area at the same time. We
accomplish this with a Mutex. We create a mutex object with a known name, so
all the processes that load this DLL use the same mutex to synchronize access
between them.
Now let's see how the mutex is integrated into our code. We have seen the
declaration of the Mutex class, which is part of System.Threading, as a private
member of the DevCache class. We pass the constructor two parameters. The
second parameter is the name of the mutex. The first thread that calls the
constructor with this name creates the mutex; any other process gets a handle
to the existing mutex by that name. The first parameter, which indicates
whether the caller gets initial ownership of the mutex, should be set to false.
private System.Threading.Mutex oMutex = new
System.Threading.Mutex(false,"MmfUpdater");
All the functions that read from or write to an MMF implement the blocking. All
we need to do is call the mutex's WaitOne method, which blocks the calling
thread while another thread holds the mutex. We must then call ReleaseMutex
when we are done. ReleaseMutex releases the mutex and signals the waiting
threads that one of them can now proceed. We can call WaitOne with a parameter
that sets the timeout period, or without a parameter to wait indefinitely:
MemoryStream ms = new MemoryStream ();
BinaryFormatter bf= new BinaryFormatter();
bf.Serialize (ms,InObject);
oMutex.WaitOne ();
MemoryMappedFile map = new MemoryMappedFile(objectName + ".nat",
MapProtection.PageReadWrite,MapAccess.FileMapAllAccess, ms.Length ,
objectName);
MapViewStream stream = map.MapView(MapAccess.FileMapAllAccess, 0,
(int) ms.Length,"" );
stream.Write(ms.GetBuffer() , 0,(int)ms.Length);
stream.Flush();
stream.Close();
oMutex.ReleaseMutex();
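In the snippet above, an exception between WaitOne and ReleaseMutex would leave the mutex held. A common way to guard against that is a try/finally block combined with a timeout, sketched below. MutexGuard and TryRun are hypothetical names, and the handling of AbandonedMutexException is an assumption about how a caller might want to react when a previous owner exited without releasing.

```csharp
using System;
using System.Threading;

// Hypothetical wrapper around a named System.Threading.Mutex.
public static class MutexGuard
{
    // Runs 'action' while holding the named mutex.
    // Returns false if the mutex could not be acquired within 'timeout'.
    public static bool TryRun(string mutexName, TimeSpan timeout, Action action)
    {
        using (var mutex = new Mutex(false, mutexName))
        {
            bool acquired = false;
            try
            {
                try
                {
                    acquired = mutex.WaitOne(timeout);
                }
                catch (AbandonedMutexException)
                {
                    // A previous owner ended without releasing the mutex.
                    // The wait still succeeded, so this thread now owns it.
                    acquired = true;
                }

                if (!acquired) return false;

                action();
                return true;
            }
            finally
            {
                // Release only if this thread actually owns the mutex.
                if (acquired) mutex.ReleaseMutex();
            }
        }
    }
}
```

A call such as MutexGuard.TryRun("MmfUpdater", TimeSpan.FromSeconds(5), ...) would wrap the MMF write shown above, and a false return tells the caller that the cache was busy for the whole timeout.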
The sample application
The sample application demonstrates how to use the cache and the performance
benefits it achieves. The sample allows you to check words against a list of
English words. If the word does not exist in the list, the application asks for
its spelling. The sample needs to read all the words from a file on the disk.
It is more efficient to read the file just once and let every instance of the
sample that we start read the data directly from memory. To do this, we first
try to get the data from the cache. If the data is in the cache, we use it. If
not, we read the words from the file, add them to a sorted list, and then add
the sorted list to the cache. The clear button clears the MMF so that the data
will be read from the file again. The sample also shows the time in
milliseconds taken by each operation.
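The lookup order described above (cache first, file second, then populate the cache) can be sketched independently of the cache implementation. In the sketch below, WordListLoader and GetWords are hypothetical names, the three delegates stand in for the cache's get and add operations and for the file reader, and the generic SortedList is an assumption about the list type.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the sample's lookup order.
public static class WordListLoader
{
    // tryGet: returns the cached list, or null when the cache is empty.
    // load:   reads the words from the slow source (the file on disk).
    // store:  puts the freshly built list into the cache.
    public static SortedList<string, string> GetWords(
        Func<SortedList<string, string>> tryGet,
        Func<IEnumerable<string>> load,
        Action<SortedList<string, string>> store)
    {
        SortedList<string, string> cached = tryGet();
        if (cached != null) return cached; // cache hit: no file access

        var words = new SortedList<string, string>(StringComparer.Ordinal);
        foreach (string word in load())
            words[word] = word; // the indexer tolerates duplicate words

        store(words);
        return words;
    }
}
```

In the sample application, tryGet and store would presumably wrap the cache's GetObject and AddObject calls, so only the first instance that starts pays the cost of reading the file.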
To run the sample, unzip the file cacheDemo.zip. Then open the cacheDemo
solution and run the winApplication (cacheDemo).
Article review
In this article we took a close look at memory-mapped files as a way to share
data between processes. During this study, we examined how to marshal data
between managed and unmanaged code. We built a class that encapsulates the API
functions, so we can access them from .NET. To use MMF, we created a class that
exposes all the functionality of MMF. To read from and write to the MMF object,
we created a class that inherits from MemoryStream. This class enables us to
move data between managed and unmanaged memory. The class can read strings
directly or objects via serialization. You can use these classes in any
solution that needs to use MMF. In our solution we decided to use serialization
to store and retrieve objects to and from the MMF, because we do not know the
size of the object in advance. We discovered that using serialization instead
of writing the object directly hurts the performance of the mechanism.
After we created the classes that enable us to read from and write to MMF
objects, we built a class that handles the logic of our solution. This class is
responsible for every situation in which we add, update, get, or remove an
object from the MMF. This class also keeps track of the objects that exist in
the cache and their locations. To prevent threads from different processes from
accessing the MMF at the same time, we used the Mutex class.
This solution is useful for several tasks. The first and most suitable is
caching data that rarely changes and is frequently requested by applications on
your server or client. With this functionality, we can load lists of data from
the database once, and then every process can quickly locate the data and use
it. Another possible use of this mechanism is sharing data between processes.
This scenario arises on web servers when the web application uses DLLs that are
registered in COM+ as server applications. These DLLs run in processes
(dllhost.exe) other than the web process, so managing state data between them
is a problem. Using this functionality, the web page can write data to the MMF
and every DLL, no matter which process it runs in, can locate this data.
About Natty Gur
Natty Gur is a freelance consultant specializing in the architecture and
development of systems of systems using ASP.NET. Natty has 12 years of
experience architecting, designing, developing and distributing software,
focusing mainly on enterprise applications involving legacy applications in
open systems. Natty is available for short contract offers (complex problems
are highly appreciated). To contact Natty, e-mail him at natty@dao2com.com or
call 972-52-8888377. Read his blog at:
http://weblogs.asp.net/ngur