24 July 2005
Mike has requested my thoughts on 3-tier and the web – a topic I avoided in my previous couple entries (
1 and 2) because I don’t find it as interesting as a smart/intelligent client model. But he’s right, the web is widely used and a lot of poor people are stuck building business software in that environment, so here’s the extension of the previous couple entries into the web environment.
<?xml:namespace prefix = o ns = “urn:schemas-microsoft-com:office:office” />
In my view the web server is an application server, pure and simple. Also, in the web world it is impractical to run the Business layer on the client because the client is a very limited environment. This is largely why the web is far less interesting.
To discuss things in the web world I break the “Presentation” layer into two parts – Presentation and UI. This new Presentation layer is purely responsible for display and input to/from the user – it is the stuff that runs on the browser terminal. The UI layer is responsible for all actual user interaction – navigation, etc. It is the stuff that runs on the web server: your aspx pages in .NET.
In web applications most people consciously (or unconsciously) duplicate most validation code into the Presentation layer so they can get it to run in the browser. Thus is expensive to create/maintain, but is an unfortunate evil required to have a half-way decent user experience in the web environment. You must still have that logic in your actual Business layer of course, because you can never trust the browser - it is too easily bypassed (
Greasemonkey anyone?). This is just the way it is on the web, and will be until we get browsers that can run complete code solutions in .NET and/or Java (that's sarcasm btw).
On the server side, the web server IS an application server. It fills the exact same role of the mainframe or minicomputer over the past 30 years of computing. For "interactive" applications, it is preferable to run the UI layer, Business layer and Data Access layer all on the web server. This is the simplest (and thus cheapest) model, and provides the best performance[1]. It can also provide very good scalability because it is relatively trivial to create a web farm to scale out to many servers. By creating a web farm you also get very good fault tolerance at a low price-point. Using ISA as a reverse proxy above the web farm you can get good security.
In many organizations the reverse proxy idea isn’t acceptable (not being a security expert I can’t say why…) and so they have a policy saying that the web server is never allowed to interact directly with the database server – thus forcing the existence of an application server that at a minimum runs the Data Access layer. Typically this application server is behind a second firewall. While this security approach hurts performance (often by as much as 50%), it is relatively easily achieved with
CSLA .NET or similar architecture/frameworks.
In other situations people prefer to put the Business layer and Data Access layer on the application server behind the second firewall. This means that the web server only runs the UI layer. Any business processing, validation, etc. must be deferred across the network to the application server. This has a much higher impact on performance (in a bad way).
However, this latter approach can have a positive scalability impact in certain applications. Specifically applications where there’s not much interactive content, but instead there’s a lot of read-only content. Most read-only content (by definition) has no business logic and can often be served directly from the UI layer. In such applications the IO load for the read-only content can be quite enough to keep the web server very busy. By offloading all business processing to an application server overall scalability may be improved.
Of course this only really works if the interactive (OLTP) portions of the application are quite limited in comparison to the read-only portions.
Also note that this latter approach suffers from the same drawbacks as the thin client model discussed in my
previous post. The most notable problem is that you must come up with a way to do non-chatty communication between the UI layer and the Business layer, without compromising either layer. This is historically very difficult to pull off. What usually happens is that the “business objects” in the Business layer require code to externalize their state (data) into a tabular format such as a DataSet so the UI layer can easily use the data. Of course externalizing object state breaks encapsulation unless it is done with great care, so this is an area requiring extra attention. The typical end result are not objects in a real OO sense, but rather are “objects” comprised of a set of atomic, stateless methods. At this point you don’t have objects at all – you have an API.
In the case of CSLA .NET, I apply the mobile object model to this environment. I personally believe it makes things better since it gives you the flexibility to run some of your business logic on the web application server and some on the pure application server as appropriate. Since the Business layer is installed on both the web and application servers, your objects can run in either place as needed.
In short, to make a good web app it is almost required that you must compromise the integrity of your layers and duplication some business logic into the Presentation layer. It sucks, but its life in the wild world of the web. If you can put your UI, Business and Data Access layers on the web application server that’s best. If you can’t (typically due to security) then move only the Data Access layer and keep both UI and Business layers on the web application server. Finally, if you must put the Business layer on a separate application server I prefer to use a mobile object model for flexibility, but recognize that a pure API model on the application server will scale higher and is often required for applications with truly large numbers of concurrent users (like 2000+).
[1] As someone in a previous post indirectly noted, there’s a relationship between performance and scalability. Performance is the response time of the system for a user. Scalability is what happens to performance as the number of users and/or transactions is increased.