
.NET Forum / ASP.NET / Caching / January 2007


Potential design for using ASP.NET Cache object in a web farm

Martin - 31 Dec 2006 12:46 GMT
Hello all,

We know that designing a web application that is both scalable and high
performance is difficult.

Scalability implies lots of web servers all referring back to a central SQL
server, which in turn implies limited caching which in turn hurts
performance opportunities.

Clearly there is no right answer for all scenarios, but I have been thinking
over a particular design which I would like to get your views on...

This scenario involves a collection of data which is concerned with an
overall user operation.  The data is persisted across multiple tables, but
the primary keys are hierarchical in nature, e.g. a house relates to rooms,
which relate to furniture. The house primary key forms part of the room and
furniture primary keys.

I would like to use typed datasets for all the benefits they have, and I
would like to use timestamps to assist in concurrent edit checking.  One
dataset would hold the data for one house (plus related tables).
I would like to cache the datasets in the ASP.NET application.  If a cached
dataset times out, so be it; I can go and fetch a new version.
I would expect data edits to be applied to the database as part of the web
request operation, so dataset and database remain in sync.
I would not anticipate using Session state for this application.
I would cache the data key in a client-side cookie.
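
To make the "cache it, and refetch when it times out" idea concrete, here is a minimal sketch in Python (chosen only as neutral illustration; the thread itself concerns the ASP.NET Cache object). The names `ExpiringCache`, `get_or_load` and the `load` callback are hypothetical and stand in for whatever would fetch one house's dataset from the database.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ExpiringCache:
    """Tiny in-process cache: entries expire after ttl_seconds and are reloaded on demand."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries: Dict[Any, Tuple[float, Any]] = {}

    def get_or_load(self, key: Any, load: Callable[[Any], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: serve from the cache
        value = load(key)    # missing or expired: go back to the database
        self._entries[key] = (now + self._ttl, value)
        return value

    def put(self, key: Any, value: Any) -> None:
        """Store the post-edit value so cache and database stay in step."""
        self._entries[key] = (self._clock() + self._ttl, value)
```

Because the cache lives inside one process, every web process ends up holding its own copy, which is why the rest of the design depends on routing a given house to a given process.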

I require affinity to the specific cache and therefore web process (across
multiple servers and CPUs).  *In my view getting into cache synchronisation
across web servers will hurt the very performance gains we are trying to get
via caching in the first place.*

As a user becomes interested in a specific set of data (house) the datakey
cookie would be set, and this would drive the selection of web process that
is best suited to serve the request.  Consequently as the user works with
the site, different requests may be served by different web
processes/servers.  If the datakey cookie is not set, then no cache affinity
is required.

I have looked for some extension to Microsoft's Network Load Balancer using
a provider pattern to allow me to control the selection criteria of a
specific web process, but without success.
I want to take advantage of the NLB heart beat facility.  The scenario I
imagine is say a collection of four web processes (spread say across two
servers each with dual processor).  I *think* I can give each web process a
distinct url by using application pool configuration, but I haven't
confirmed this yet.

So I would expect my web process selection algorithm to be driven by the
value of the cookie holding a datakey.  The algorithm would distribute the
requests according to the data keys.  I was thinking something simple like
modulo 4 of the house ID in this scenario.  When a server goes down NLB
should know this, and expose this to my provider code.  My web process
selection algorithm would check the required web process is alive (referring
to the NLB API), and make an alternate selection if necessary.
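
The selection rule described above could be sketched roughly as follows. This is illustrative Python only: `select_process` is a hypothetical name, and the `alive` list stands in for whatever liveness information a load balancer would expose (no such NLB provider API is assumed to exist).

```python
from typing import Sequence


def select_process(house_id: int, alive: Sequence[bool]) -> int:
    """Pick the web process for a house: house_id modulo the process count,
    falling back to the next live process if the preferred one is down."""
    count = len(alive)
    if count == 0:
        raise ValueError("no web processes configured")
    preferred = house_id % count
    for offset in range(count):
        candidate = (preferred + offset) % count
        if alive[candidate]:
            return candidate
    raise RuntimeError("no web process is alive")
```

With four processes, house 10 maps to process 2 while everything is up; if process 2 is down, the same house moves to process 3, and that process simply starts with a cold cache for it.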

So far as I am aware, the piece of the picture that is missing is a provider
pattern API in NLB to facilitate this.  I wonder if this is something that
is on the drawing board at Microsoft (or a third party supplier for that
matter).

Apart from that piece missing, the main disadvantage I can see in this
design is its defence against denial-of-service attacks.  Theoretically
attackers would need to select just four distinct IDs that each hit a
different web process, however I believe the DOS risk is sufficiently small
that this design is still widely applicable.

Other issues that I know come into play include:
security of datakey
overhead of establishing potential ssl sessions on each new web process, as
datakey changes (I think this is relatively infrequent)
authentication cookies would need to apply to scope of entire web farm.
authorisation to access data would need to occur in the web application, not
just the database.

NB this design does not preclude distributed web farm clusters on different
continents (each cluster potentially caching the same data), because at the
end of the day if concurrent data edits are detected, the dataset can be
refreshed from the database, and the user can reconfirm their edit
operation.
Also in this scenario, there are likely to be multiple databases
synchronised using replication.  Typically the set of editable data would
be configured for each database.

I would welcome a lively discussion on the viability of the design.

Thanks very much for your time.

Martin
Alvin Bruney [MVP] - 08 Jan 2007 03:47 GMT
See inline.

> Scalability implies lots of web servers all referring back to a central
> SQL server, which in turn implies limited caching which in turn hurts
> performance opportunities.
It most certainly does not. Caching avoids the SQL bottleneck.

> I would like to use typed datasets for all the benefits they have, and I
> would like to use timestamps to assist in concurrent edit checking.
Timestamps won't help you with concurrency because the timestamp isn't
guaranteed accurate, since Windows is not a real-time OS.
Even a minor lag will throw off your sync on a heavy traffic day.

> I would like to cache the datasets in the asp.net application.
Nope, cache is a poor choice because it is per process. You have a multi-cpu
architecture on a web farm. That leads to cache duplication.

> I would cache data key in a client side cookie.
What happens if cookies are lost, unreadable or client turns them off?

> multiple servers and CPUs).  *In my view getting into cache
> synchronisation across web servers will hurt the very performance gains we
> are trying to get via caching in the first place.*
Yes.

> As a user becomes interested in a specific set of data (house) the datakey
> cookie would be set, and this would drive the selection of web process
> that is best suited to serve the request.
Yes, but this is all driven by the client. Not a particularly good choice
since the client doesn't have to follow the rules you impose; that is, a
client can most easily disable cookies.

> I *think* I can give each web process a
> distinct url by using application pool configuration, but I haven't
> confirmed this yet.
That doesn't solve your cache affinity problem.

> different web process, however I believe the DOS risk is sufficiently
> small that this design is still widely applicable.
You are pushing a cookie to the client; a malicious client can generate
multiple cookies that in turn drive the caching mechanism in your
architecture, right?
Then, it's easy to flood the cache architecture from the client since every
request is valid.

Signature

Regards,
Alvin Bruney
------------------------------------------------------
Shameless author plug
Excel Services for .NET is coming...
OWC Black book on Amazon and
www.lulu.com/owc

Martin - 21 Jan 2007 11:40 GMT
Hello Alvin,

Are you disagreeing with the whole philosophy of using the cache to help
serve the request as close to the client as possible?

I appreciate this brings challenges in a webfarm environment, and that's
what I'm wanting to address.

If you have good web references to how you would approach the overall goal
of increased performance with caching in webfarms, that would be
interesting.

I've made individual comments inline.

Thanks
Martin

> See inline.
>
>> Scalability implies lots of web servers all referring back to a central
>> SQL server, which in turn implies limited caching which in turn hurts
>> performance opportunities.
> It most certainly does not. Caching avoids the SQL bottleneck.
The point I'm making here is that in a web farm environment, the standard
practice is to reference back to central db server *not* to use caching.
Using caching introduces new challenges which I'm trying to address.

>> I would like to use typed datasets for all the benefits they have, and I
>> would like to use timestamps to assist in concurrent edit checking.
> timestamps won't help you with concurrency because the timestamp isn't
> guaranteed accurate since windows is not a real time OS.
> Even a minor lag will thru off your sync on a heavy traffic day.
Here is a quote from  http://support.microsoft.com/kb/170380
"TimeStamp is a SQL Server data type that is automatically updated every
time a row is inserted or updated. Values in TimeStamp columns are not
datetime data; they are, by default, defined as binary(8) varbinary(8),
indicating the sequence of Microsoft SQL Server activity on the row. A table
can have only one TimeStamp column. The TimeStamp data type is simply a
monotonically-increasing counter whose values will always be unique within a
database.
"
What's wrong with that?
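
To illustrate how a per-row version value supports concurrent edit checking, here is a small sketch. It is Python with SQLite, purely as a stand-in: SQLite has no TimeStamp type, so an integer `version` column incremented on every update plays that role. The table, column and function names are hypothetical.

```python
import sqlite3


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding one room row at version 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE room (room_id INTEGER PRIMARY KEY, "
        "name TEXT NOT NULL, version INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO room (room_id, name, version) VALUES (1, 'Kitchen', 1)")
    conn.commit()
    return conn


def update_room_name(conn: sqlite3.Connection, room_id: int,
                     new_name: str, expected_version: int) -> bool:
    """Apply the edit only if the row still carries the version that was read.

    Returns False when another edit got there first, in which case the caller
    would refresh its cached copy and ask the user to reconfirm."""
    cur = conn.execute(
        "UPDATE room SET name = ?, version = version + 1 "
        "WHERE room_id = ? AND version = ?",
        (new_name, room_id, expected_version),
    )
    conn.commit()
    return cur.rowcount == 1
```

Two users who both read version 1 cannot both succeed: the first update bumps the version, so the second matches no row and is reported as a conflict. No wall-clock time is involved in the comparison.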

>> I would like to cache the datasets in the asp.net application.
> Nope, cache is a poor choice because it is per process. You have a
> multi-cpu architecture on a web farm. That leads to cache duplication.
I want to address this with application pool configuration, but I'm not sure
if I can.
What would you do?

>> I would cache data key in a client side cookie.
> What happens if cookies are lost, unreadable or client turns them off?
If they're turned off they could be put in the url (by an http module)
If they are lost or unreadable that would cause interference with the user's
browsing experience.

>> multiple servers and CPUs).  *In my view getting into cache
>> synchronisation across web servers will hurt the very performance gains
[quoted text clipped - 12 lines]
>> confirmed this yet.
> That doesn't solve your cache affinity problem.
Doesn't it?  I've not tried yet.
Got any ideas then?

>> different web process, however I believe the DOS risk is sufficiently
>> small that this design is still widely applicable.
[quoted text clipped - 3 lines]
> Then, it's easy to flood the cache architecture from the client since
> every request is valid.
I agree
What's your DOS answer?

Alvin Bruney [MVP] - 26 Jan 2007 00:56 GMT
> Are you disagreeing with the whole philosophy of using the cache to help
> serve the request as close to the client as possible?
In principle, yes, because it causes more problems than it solves, especially
for web farms. It is workable, but at what cost?

> If you have good web references to how you would approach the overall goal
> of increased performance with caching in webfarms, that would be
> interesting.
Actually, the patterns and practice group at MS has released the
authoritative work on that. I just happen to subscribe to what it preaches.

> The point I'm making here is that in a web farm environment, the standard
> practice is to reference back to central db server *not* to use caching.
It may be standard practice, but it is dead wrong with respect to
scalability.

> What's wrong with that?
In even a moderately concurrent environment, by the time you read the data it
may have already changed because of another update, making your published
value stale.

> What would you do?
For a web farm that requires shared resources, you have to move the dataset
back into the database. The problems that caching introduces in a highly
concurrent environment that requires shared access outweigh the benefits. The
exception to this case is the asp net cache service. It's actually a viable
option because it outperforms sql.

> Got any ideas then?
Come to think about it, I think the asp net cache service is a valid choice.
I can't see a reason why it won't address your problems. The only thing I
can see turning sour is the cost of serialization, but then again that is
the same with SQL. There's another issue to do with pages being loaded that
incur a page lock during the serialization process. That can block threads
if a page takes particularly long to execute.

> What's your DOS answer?
If you go that route, you'd need to somehow flag invalid responses and only
let in valid responses into your pipeline architecture. That way, even with
a DOS attack, it won't trigger your cache mechanism.
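
One possible way to keep made-up data keys from reaching the cache is to have the server sign the key it hands out and reject anything whose signature does not verify. The sketch below is Python for illustration only; the function names, the `id.signature` cookie layout and the shared secret are assumptions and not something proposed in this thread.

```python
import hashlib
import hmac
from typing import Optional


def sign_datakey(house_id: int, secret: bytes) -> str:
    """Build a cookie value of the form '<house_id>.<hex HMAC-SHA256>'."""
    payload = str(house_id)
    mac = hmac.new(secret, payload.encode("ascii"), hashlib.sha256).hexdigest()
    return f"{payload}.{mac}"


def verify_datakey(cookie_value: str, secret: bytes) -> Optional[int]:
    """Return the house id if the signature checks out, otherwise None."""
    payload, sep, mac = cookie_value.partition(".")
    if not sep or not (payload.isascii() and payload.isdigit()):
        return None
    expected = hmac.new(secret, payload.encode("ascii"), hashlib.sha256).hexdigest()
    # Constant-time comparison; bytes are used so odd input cannot raise.
    if not hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8")):
        return None
    return int(payload)
```

Requests whose key fails verification can be dropped before any cache lookup or database fetch happens. This only filters forged keys; a client holding many validly issued keys would still need rate limiting or authorisation checks.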

In a nutshell, there are no guarantees and it doesn't make your architecture
wrong. There are however, a few more things that you need to be wary of if
you decide to pursue your option. There may be ways to get your architecture
working in a scalable environment but you need to consider and plan for
these issues ahead of time.


