See inline.
> Scalability implies lots of web servers all referring back to a central
> SQL server, which in turn implies limited caching which in turn hurts
> performance opportunities.
It most certainly does not. Caching avoids the SQL bottleneck.
> I would like to use typed datasets for all the benefits they have, and I
> would like to use timestamps to assist in concurrent edit checking.
Timestamps won't help you with concurrency because the timestamp isn't
guaranteed to be accurate, since Windows is not a real-time OS.
Even a minor lag will throw off your sync on a heavy-traffic day.
> I would like to cache the datasets in the asp.net application.
Nope, the cache is a poor choice because it is per process. You have a
multi-CPU architecture on a web farm. That leads to cache duplication.
> I would cache data key in a client side cookie.
What happens if cookies are lost or unreadable, or the client turns them off?
> multiple servers and CPUs). *In my view getting into cache
> synchronisation across web servers will hurt the very performance gains we
> are trying to get via caching in the first place.*
Yes.
> As a user becomes interested in a specific set of data (house) the datakey
> cookie would be set, and this would drive the selection of web process
> that is best suited to serve the request.
Yes, but this is all driven by the client. That's not a particularly good
choice, since the client doesn't have to follow the rules you impose; that
is, a client can very easily disable cookies.
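To make the trade-off concrete, here is a minimal sketch of the cookie-driven selection being discussed, written in Python for brevity rather than ASP.NET. The process URLs, the `pick_process` name and the key format are all hypothetical; the thread does not specify them. A stable hash is used because Python's built-in `hash()` varies between processes.

```python
import hashlib

# Hypothetical pool of web processes; these URLs are illustrative only.
WEB_PROCESSES = [
    "http://web1.example.com/",
    "http://web2.example.com/",
    "http://web3.example.com/",
]


def pick_process(data_key, processes=WEB_PROCESSES):
    """Map a data key (e.g. from a cookie) to one process via a stable hash."""
    if not data_key:
        # Cookie disabled, lost or unreadable: no affinity is possible, so
        # fall back to a default process and accept a likely cache miss.
        return processes[0]
    digest = hashlib.sha256(data_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(processes)
    return processes[index]
```

The same key always lands on the same process, so that process's cache stays warm for it. The fallback branch is exactly the weak spot raised above: a client that sends no cookie gets no affinity.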
> I *think* I can give each web process a
> distinct url by using application pool configuration, but I haven't
> confirmed this yet.
That doesn't solve your cache affinity problem.
> different web process, however I believe the DOS risk is sufficiently
> small that this design is still widely applicable.
You are pushing a cookie to the client. A malicious client can regenerate
multiple cookies that in turn drive the caching mechanism in your
architecture, right?
Then it's easy to flood the cache architecture from the client, since every
request is valid.

Regards,
Alvin Bruney
------------------------------------------------------
Shameless author plug
Excel Services for .NET is coming...
OWC Black book on Amazon and
www.lulu.com/owc
Martin - 21 Jan 2007 11:40 GMT
Hello Alvin,
Are you disagreeing with the whole philosophy of using the cache to help
serve the request as close to the client as possible?
I appreciate this brings challenges in a webfarm environment, and that's
what I'm wanting to address.
If you have good web references to how you would approach the overall goal
of increased performance with caching in webfarms, that would be
interesting.
I've made individual comments inline.
Thanks
Martin
> See inline.
>
>> Scalability implies lots of web servers all referring back to a central
>> SQL server, which in turn implies limited caching which in turn hurts
>> performance opportunities.
> It most certainly does not. Caching avoids the SQL bottleneck.
The point I'm making here is that in a web farm environment, the standard
practice is to refer back to a central DB server, *not* to use caching.
Using caching introduces new challenges, which I'm trying to address.
>> I would like to use typed datasets for all the benefits they have, and I
>> would like to use timestamps to assist in concurrent edit checking.
> timestamps won't help you with concurrency because the timestamp isn't
> guaranteed accurate since windows is not a real time OS.
> Even a minor lag will throw off your sync on a heavy traffic day.
Here is a quote from http://support.microsoft.com/kb/170380
"TimeStamp is a SQL Server data type that is automatically updated every
time a row is inserted or updated. Values in TimeStamp columns are not
datetime data; they are, by default, defined as binary(8) varbinary(8),
indicating the sequence of Microsoft SQL Server activity on the row. A table
can have only one TimeStamp column. The TimeStamp data type is simply a
monotonically-increasing counter whose values will always be unique within a
database.
"
What's wrong with that?
>> I would like to cache the datasets in the asp.net application.
> Nope, cache is a poor choice because it is per process. You have a
> multi-cpu architecture on a web farm. That leads to cache duplication.
I want to address this with application pool configuration, but I'm not
sure if I can.
What would you do?
>> I would cache data key in a client side cookie.
> What happens if cookies are lost, unreadable or client turns them off?
If they're turned off, they could be put in the URL (by an HTTP module).
If they are lost or unreadable, that would interfere with the user's
browsing experience.
>> multiple servers and CPUs). *In my view getting into cache
>> synchronisation across web servers will hurt the very performance gains
[quoted text clipped - 12 lines]
>> confirmed this yet.
> That doesn't solve your cache affinity problem.
Doesn't it? I've not tried yet.
Got any ideas then?
>> different web process, however I believe the DOS risk is sufficiently
>> small that this design is still widely applicable.
[quoted text clipped - 3 lines]
> Then, it's easy to flood the cache architecture from the client since
> every request is valid.
I agree
What's your DOS answer?
Alvin Bruney [MVP] - 26 Jan 2007 00:56 GMT
> Are you disagreeing with the whole philosophy of using the cache to help
> serve the request as close to the client as possible?
In principle, yes, because it causes more problems than it solves, especially
for web farms. It is workable, but at what cost?
> If you have good web references to how you would approach the overall goal
> of increased performance with caching in webfarms, that would be
> interesting.
Actually, the patterns & practices group at Microsoft has released the
authoritative work on that. I just happen to subscribe to what it preaches.
> The point I'm making here is that in a web farm environment, the standard
> practice is to reference back to central db server *not* to use caching.
It may be standard practice, but it is dead wrong with respect to
scalability.
> What's wrong with that?
In even a moderately concurrent environment, by the time you read the data it
may have already changed because of another update, making your published
value stale.
> What would you do?
For a web farm that requires shared resources, you have to move the dataset
back into the database. The problems that caching introduces in a highly
concurrent environment that requires shared access outweigh the benefits. The
exception to this is the ASP.NET cache service. It's actually a viable
option because it outperforms SQL.
> Got any ideas then?
Come to think of it, I think the ASP.NET cache service is a valid choice.
I can't see a reason why it won't address your problems. The only thing I
can see turning sour is the cost of serialization, but then again that is
the same with SQL. There's another issue too: pages being loaded incur a
page lock during the serialization process. That can block threads
if a page takes particularly long to execute.
> What's your DOS answer?
If you go that route, you'd need to somehow flag invalid requests and let
only valid requests into your pipeline architecture. That way, even a
DOS attack won't trigger your cache mechanism.
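One possible way to do that flagging, offered as a sketch under stated assumptions and not as a method from the thread: have the server sign the data key it puts in the cookie with an HMAC, and reject any cookie whose signature does not verify before it reaches the cache. The example is in Python; `SECRET_KEY`, `sign_key` and `verify_cookie` are hypothetical names, and the `key.signature` cookie layout is an assumption.

```python
import hashlib
import hmac

# Hypothetical server-side secret; it must never be sent to the client.
SECRET_KEY = b"replace-with-a-server-side-secret"


def _mac(data_key):
    return hmac.new(SECRET_KEY, data_key.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_key(data_key):
    """Build the cookie value: the data key plus its signature."""
    return f"{data_key}.{_mac(data_key)}"


def verify_cookie(cookie_value):
    """Return the data key if the signature is valid, otherwise None."""
    data_key, sep, mac = cookie_value.rpartition(".")
    if not sep or not data_key:
        return None  # malformed: no signature part at all
    expected = _mac(data_key)
    # Constant-time comparison; bytes avoid errors on non-ASCII input.
    if hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8")):
        return data_key
    return None
```

A client can still replay a cookie it was legitimately given, so this only stops fabricated keys from driving the cache; it is not a complete DOS defence.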
In a nutshell, there are no guarantees, and that doesn't make your
architecture wrong. There are, however, a few more things that you need to be
wary of if you decide to pursue your option. There may be ways to get your
architecture working in a scalable environment, but you need to consider and
plan for these issues ahead of time.

Regards,
Alvin Bruney