Open source software security

Designing a Data Driven Website - Part 1

30 November -0001
Originally posted: 19 February 2002
Last updated: 20 February 2006

Overview:

  • Choosing an Operating System
  • Deciding on a Scripting Language
  • Picking a Database

Choosing an Operating System

The first step in designing a data driven website is to choose your platform. This is possibly the most critical step in the design phase and will have long termm ramifications for your site in general. Operating system choice will sometimes be determined by hardware (such as in the case of Apple or Sun hardware), budget, or hosting arrangement. There are many available hosts online that will provide hosting (including a database and scripting language) for a monthly fee. In choosing a hosting provider, be sure to examine what technologies are available. You may also want to determine if your website will be large enough or need enough special features that you'll require a 'co-lo'. This is a collocated server, meaning that you either lease or own a server that is maintained in the hosting provider's environment. This option ensures autonomy and speed, but also costs a lot more. In this article I will limit my discussion to Windows Server and Linux.

Each brand of server software presents its own advantages and disadvantages. Both are rather cryptic systems and will require extensive familiarization to get proficient in administration. Out of the box, Microsoft's Windows platform is easier to use because of the point and click GUI options available. Linux provides more robust options at virtually no cost. I would say that all things being equal, the question of server platform will often depend on available resources. Windows Server is quite expensive (over $1000 last I checked) but comes with Microsoft support. Linux is free or virtually free and offers far more comprehensive online help but little actual technical support. There are several 'flavors' or 'distros' of Linux available (Red Hat, SUSE, Mandriva, Linspire, and Ubuntu just to name a few). At the core they all share the same kernel and are open source. Many distributions sell their products and provide robust support as well.

Deciding on a Scripting Language

The other consideration in deciding on platform operating system software addresses the second consideration of data driven web design: scripting language and database support. There are a myriad of scripting languages but the main ones are: Active Server Pages (ASP, now commonly referred to as ASP.Net (ay-ess-pee dot net)), Cold Fusion MX, Java Server Pages (JSP), Perl, and PHP. The most popular at this time seem to be ASP, Cold Fusion and PHP. There are several advantages and disadvantages to all.

ASP is Microsoft's native scripting language. Originally based on Visual Basic, if you have IIS (Microsoft Internet Information Server installed by default on Windows 2000 server) you already support ASP files. ASP is a robust language that works well but has a few disadvantages.

With the advent of .NET, Microsoft addressed many of the disadvantages of earlier ASP development. With .NET ASP is much more tightly integrated with other .NET languages (such as C#) and tools, making ASP.NET extremely robust, working well with the Windows platform on many levels. My main complaint about Windows languages is the rather closed community. Finding support online is often convoluted and finding good support documentation and code examples can be excruciating. If you are familiar with Visual Basic or other .NET languages, picking up ASP won't be very difficult. If, however, you are not proficient in Visual Basic or programming, ASP is a very difficult language to learn.

Also be aware that ASP is not portable. This means that once you have committed to developing your site in ASP you are restricted to a Windows hosting arrangement. With advanced technical help (specifically with the help of the Mono project) you could probably move your ASP pages to a linux or unix host, but this would be awkward and difficult.

Cold Fusion MX is Macromedia's web scripting language. Cold Fusion works on both Windows and Linux platforms, although it dovetails very nicely with Windows and Microsoft SQL server. Cold Fusion requires and additional expense above and beyond your operating system, but comes with a splendid development environment (DreamWeaver) that has become the benchmark for dynamic scripting editors. Cold Fusion leverages simpler tags and mark ups, and being Macromedia technology, ideally works well with Flash. Cold Fusion also provides and interactive Administrator to better handle your pages' performance and capabilities. Based on tags very similar in appearance to HTML, Cold Fusion is perhaps the easiest web scripting language to learn. Cold Fusion doesn't work all that easily on a Linux platform, however, and much of its capabilities for database integration are lost on Linux.

Java Sever Pages are often thought of as the web scripting language du jour, however, they are extremely powerful. They require the installation of the Java Runtime Environment on the server, and are a bit tricky to configure. Java provides enormous power and often compliments a robust, enterprise solution (like Oracle) quite well. The java community is strong and open, meaning it is easy to find skilled java programmers and support. JSP pages tend to take a bit more care in their planning and development, and deployment is more complex than simply uploading finished files to a server. Java is often the language developers turn to when there is a lot of heavy lifting to be done.

Perl is used more often for CGI than for actual web scripting but it can be done. Perl is a relatively simple language to learn but has extensive capabilities. Perl can be used as a scripting language on both Windows and Linux platforms with very little trouble. I believe it is easier to use Perl on an Apache server than on IIS. Perl seems to be somewhat awkward in rendering HTML and integrating with databases but is excellent for accomplishing complex CGI and server related tasks. With PHP 5 Perl support is native so it might be easier for many tasks to utilize the two in tandem.

PHP is one of the hottest new scripting languages around. PHP works best on Apache on either a Windows or Linux platform and has excellent support most databases. With the advent of PHP 5, the language has retained it's easy to learn roots but boasts excellent advanced features. PHP 5 takes PHP to a level on par with Java in terms of Object Oriented Programming and advanced language features. In fact, this has become a source of some criticism of PHP (sysadmins believe it to be so powerful that it is potentially dangerous, and PHP is approachable enough for a novice to pick up easily, allowing them to wield great power with little training). By far and away the best development environment for PHP is Zend Studio but PHP development is also possible with Macromedia Dreamweaver or even TextPad. PHP has extensive online support and code snippets. Because PHP is a free scripting language, it is often the choice of open source developers.

Other Choices

Other choices of scripting languages include Python, Ruby, PL-SQL (on Oracle) and many other smaller or more obscure technologies.

Picking a Database

Choosing your database back-end is the third major consideration for a data driven site and is often dictated by your choice of scripting language and server operating system. Since ASP works best on Microsoft IIS, you may be limited in choosing databases that run on Microsoft unless you want a separate database server. PHP, while it will function with ODBC, easily leverages many advanced features of open source or linux/unix native database solutions.

The three most database solutions I am most familiar with are Microsoft SQL Server, MySQL, and Oracle. A host of other solutions exist (DB2, SQLite, Postgre SQL, Microsoft Access, Apple's Filemaker, etc.) these are some of the most common. Each has its own set of advantages and disadvantages.

*A word about desktop databases. Often times it is tempting to use a desktop database system such as Access or Filemaker to support a dynamic website. While this is very possible, and actually extremely approachable, I would avoid using such technologies. The problem with many desktop database systems is that they don't scale well, meaning they could work fine as long as your site is small, but as it grows and you look to add new functionality you may hit a roadblock if you are using a desktop system. It is best to plan for the long term and choose a robust, enterprise level database, even if you don't plan to leverage its complete functionality right away.

Microsoft SQL server is expensive and requires an additional cost to install. It works beautifully in tandem with Microsoft Windows Server and ASP.NET, PHP and Cold Fusion. SQL Server provides great GUI's for data manipulation, transformation and maintenance. These GUI's also support access to remote SQL Servers across the internet via TCP/IP port 1433 (ODBC). SQL Server is robust and intuitive, easy to learn and reliable.

Oracle is often thought of as the high-end database solution. Oracle works best on a Linux/Unix environment, isn't very user friendly, and is a bear to install and maintain. Oracle is expensive but well worth the cost for extremely large and fast transactions. Oracle should really only be a consideration if you are thinking of working with tens of thousands of records at a time across systems that need high reliability and responsiveness.

MySQL is an open source solution. Although it lacks many of the features of Oracle and SQL Server, the price is definitely right - FREE. Up until version 5 MySQL had the major drawback of not enforcing referential integrity (it won't check to make sure that you aren't deleting a primary key that fills foreign key dependencies), but this could be fixed with careful application development. With the newest version of MySQL you *can* enforce referential integrity, as well as leverage a host of advanced features (such as triggers, stored procedures, etc.) that put MySQL on par with other proprietary databases. MySQL has no native GUI interface and must be accessed at the command line making data maintenance and transformation a bit of a hassle. There are GUI tools that you can download from the MySQL website, however, that make administration from a windows machine much easier. MySQL works easily on both Windows and Linux and integrates wonderfully with PHP and Perl.

Once you have made a decision as to what server operating system you wish to run, what database backend you wish to support, and the scripting language you will use, you are ready to roll. These choices will help you to choose a hosting provider or reflect your available assets. The next phase in designing your data driven website is the actual data modeling. This will ensure that as your site expands you will always be able to handle the data you need to. Some people consider data modeling an afterthought but poor data modeling can create headaches and even wreck a site down the road. Data modeling will be the subject of the second tutorial in this series.