> >> As others have stated, a database is one obvious solution.
>
[quoted text clipped - 6 lines]
>
> Why should a page file be slower than any other disk file?
>>>> As others have stated, a database is one obvious solution.
>>>> If you are on 64 bit you could also just use a collection.
[quoted text clipped - 18 lines]
> disk-based collection could keep all the necessary indexing data in a
> single page that never gets swapped out.
"A collection that's designed to run off the disk will probably have an
indexing system"
Sounds like a great idea.
But I have an even better idea. Let us implement that in memory as well.
We could call it Hashtable or Dictionary.
:-)
Arne
>> Why should a page file be slower than any other disk file?
>
> An in-memory collection whose contents are being paged to and from the
> disk by the OS will have worse performance than a collection designed
> to operate off the disk, as soon as you do any kind of search on it.
Why? There's no a priori reason to believe this is true, even though it's
true that _some_ in-memory collections may not be as efficient as a
disk-based database.
> A collection that's designed to run off the disk will probably have an
> indexing system so it doesn't have to load the entire file to find a
> single element. But searching through a massive memory-based
> collection will cause many pages to be swapped in, possibly causing
> other useful pages to be swapped out and lowering performance down the
> road.
You're assuming that the in-memory structure would not have a similar
indexing mechanism.
Now, I don't know the implementation of DataTable. But as a general
concept, there's absolutely no reason it couldn't be indexed in basically
the same way as a database. Conversely, if a database implements (for
example) an index as a simple sorted array that uses a binary search, it's
going to have the exact same liability that an in-memory structure paged
to the disk using the same indexing scheme would have.
> For example, a binary search on a memory-based collection might end up
> having to load half the file into memory, one page at a time, while a
> disk-based collection could keep all the necessary indexing data in a
> single page that never gets swapped out.
If that's a concern, why wouldn't someone just have a similar "index only"
data section for their data structure for the in-memory implementation?
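To put rough numbers on the index-only idea, here is a minimal Python
sketch (not .NET code) that counts the distinct pages a binary search
probes. The 4096-byte page size, the 256-byte record and 8-byte key sizes,
and the function name are all assumptions made for illustration; it models
a flat sorted array and says nothing about how DataTable or any real
database lays out its data.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def pages_touched_by_binary_search(n_items, item_size, target_index):
    """Return the set of page numbers probed while binary-searching a
    sorted array of n_items fixed-size items for the item at target_index."""
    pages = set()
    lo, hi = 0, n_items - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        # Each probe reads one item; record which page that item lives on.
        pages.add(mid * item_size // PAGE_SIZE)
        if mid == target_index:
            break
        if mid < target_index:
            lo = mid + 1
        else:
            hi = mid - 1
    return pages


N = 1_000_000
# Keys embedded in 256-byte records (hypothetical record size).
full_records = pages_touched_by_binary_search(N, 256, 123_456)
# Separate "index only" array of 8-byte keys (hypothetical key size).
keys_only = pages_touched_by_binary_search(N, 8, 123_456)
print(len(full_records), len(keys_only))
```

In this model the page count can never exceed the number of probes, which
is about log2(N), so the search touches a small number of pages in either
layout. The compact key array simply packs the final probes into fewer
pages.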
It seems to me that if all you know is that one implementation is a
disk-based database and the other is an in-memory data structure, that
is not nearly enough information to tell you which will perform better.
Pete
Jesse McGrew - 20 Jan 2008 10:00 GMT
On Jan 17, 7:48 pm, "Peter Duniho" <NpOeStPe...@nnowslpianmk.com>
wrote:
> >> Why should a page file be slower than any other disk file?
>
[quoted text clipped - 5 lines]
> true that _some_ in-memory collections may not be as efficient as a
> disk-based database.
Yes, it's possible to write an in-memory collection that will perform
as well when paged to disk as a disk-based collection. But the
specific collection types that have been mentioned here don't fit the
bill, nor do any of the other standard in-memory collections (AFAIK).
Jesse
Peter Duniho - 20 Jan 2008 18:19 GMT
> Yes, it's possible to write an in-memory collection that will perform
> as well when paged to disk as a disk-based collection. But the
> specific collection types that have been mentioned here don't fit the
> bill, nor do any of the other standard in-memory collections (AFAIK).
I don't know about that. They may not be optimized for swapping behavior,
but the indexed collections are implemented with an index that is separate
from the data collection itself, and finding an element requires iterating
only through the index, not through the entire data collection. Just as a
file-based, indexed collection only needs to pull in data from an index
file rather than the entire database, a Dictionary<> (for example) only
needs to pull in data from the hashed index, rather than the entire
collection, in order to find a specific element.
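As a rough illustration of the hashed-index point, here is a minimal
Python sketch of a chained hash table that reports which pages a single
lookup would touch. It is a toy model, not the actual Dictionary<>
implementation: the class name, the 4096-byte page, the 8-byte bucket slot
and the 64-byte entry are all assumptions, and it ignores any extra pages
reached by following references to keys or values stored elsewhere.

```python
PAGE_SIZE = 4096   # assumed page size in bytes
SLOT_SIZE = 8      # assumed bytes per bucket slot
ENTRY_SIZE = 64    # assumed bytes per stored entry


class PagedHashModel:
    """Toy chained hash table (hypothetical) that tracks page touches."""

    def __init__(self, n_buckets):
        self.buckets = [[] for _ in range(n_buckets)]  # entry indices per bucket
        self.entries = []                              # (key, value) pairs

    def add(self, key, value):
        self.entries.append((key, value))
        bucket = hash(key) % len(self.buckets)
        self.buckets[bucket].append(len(self.entries) - 1)

    def lookup(self, key):
        """Return (value, pages). Pages are tagged by array so bucket
        pages and entry pages are counted separately."""
        bucket = hash(key) % len(self.buckets)
        pages = {("buckets", bucket * SLOT_SIZE // PAGE_SIZE)}
        for i in self.buckets[bucket]:
            pages.add(("entries", i * ENTRY_SIZE // PAGE_SIZE))
            if self.entries[i][0] == key:
                return self.entries[i][1], pages
        return None, pages


N = 100_000
table = PagedHashModel(n_buckets=50_000)  # two keys per bucket by construction
for k in range(N):
    table.add(k, k * 2)

value, pages = table.lookup(73_456)
print(value, len(pages))
```

In this model a lookup touches one bucket page plus one entry page per
chain element examined, so the count depends on chain length rather than
on the total number of entries.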
Is there something you see as being particularly different about the two
scenarios? I'm not seeing it myself. I could imagine that there are
subtle differences in performance, but I don't see anything fundamentally
different about them.
Pete
Arne Vajhøj - 21 Jan 2008 01:18 GMT
> On Jan 17, 7:48 pm, "Peter Duniho" <NpOeStPe...@nnowslpianmk.com>
> wrote:
[quoted text clipped - 10 lines]
> specific collection types that have been mentioned here don't fit the
> bill, nor do any of the other standard in-memory collections (AFAIK).
It is not obvious to me why Hashtable/Dictionary<> should not
have nice O(1) characteristics for the number of pages that need
to be read from disk.
Can you elaborate a bit?
Arne