Friday, November 2, 2007

Why PostgreSQL Instead of MySQL: Comparing Reliability and Speed in 2007

Introduction

For years, the common industry perception has been that MySQL is faster and easier to use than PostgreSQL. PostgreSQL is perceived as more powerful, more focused on data integrity, and stricter at complying with SQL specifications, but correspondingly slower and more complicated to use.

Like many perceptions formed in the past, these things aren't as true with the current generation of releases as they used to be. Both systems have evolved with landmark releases that make comparing the two a lot more complicated.

  • MySQL 5.0 (October 2005) finally added a "strict mode" to narrow the gap in terms of data integrity and spec compliance. It also added support for stored procedures, views, triggers, and cursors, all considered essential features for some classes of database deployments.
  • PostgreSQL 8.1 (November 2005) featured major performance improvements, particularly in scalability. Focusing on improving performance has been central to all the 8.X releases up to the current 8.2.

As innovation on these databases has progressed, each development community has actively made changes to address their respective sets of perceived disadvantages. The result is that it has become more difficult to objectively determine which database is likely to be better suited for a given application. This document aims to clarify what situations PostgreSQL would be more appropriate for than MySQL, attempting to fairly compare the current production versions of each and discuss their strengths and weaknesses. The main areas covered here are the fundamental data integrity and speed issues of the core database software. Since it's often the case that you can trade off performance against reliability, both these topics need to be considered together in order to get an accurate view of the landscape.

The position of this paper is that when the two are compared using the high level of data integrity demanded by a serious transactional database application, the current generation PostgreSQL performs similarly to or better than MySQL (particularly under heavy user loads and with complex queries), while retaining its lead in the areas of SQL standards compliance and a rich feature set. It is also hoped that by exploring the differences between the two systems, you might come to appreciate how the fundamental approach of the PostgreSQL design team pervasively prioritizes reliable and predictable behavior. Similar portions of the MySQL implementation have some seams resulting from how features like transactional support and strict mode were added onto the software well into its design lifecycle rather than being integral from the start.

Compared Versions, Feature Sets, and Focus

The current production-ready versions as this is written in August of 2007 are PostgreSQL 8.2 and MySQL 5.0, and those are what's being compared here. Since both PostgreSQL 8.1 and 8.2 are currently supported versions with good performance, some comments here may refer to them collectively. 8.2 is moderately faster (perhaps as much as 30% on some workloads), but deploying 8.1 is still a completely viable option right now, particularly because more operating system vendors bundle and support it than the relatively new 8.2.

Both systems have newer versions in testing (PostgreSQL 8.3 and MySQL 5.1) at this time that offer incremental improvements to some areas this document addresses, but neither new version includes changes so fundamental that the general guidelines here would be invalidated. The official release of the MySQL Falcon engine in the future is one likely disruptive point on the horizon. The Asynchronous Commit feature in PostgreSQL 8.3 is another upcoming change that will significantly shift the reliability/performance trade-off options available.

What is specifically not addressed here are the feature sets of the two products in areas outside of these fundamentals. Because of the scale of the changes in PostgreSQL 8.1 and MySQL 5.0, many of the documents covering this topic are too out of date to recommend. Some pages that may be helpful include:

  • Open Source Database Feature Comparison Matrix
  • MySQL vs. PostgreSQL
  • MySQL Development Roadmap
  • PostgreSQL Feature Matrix

While feature checklists are useful, some system behaviors require a fairly deep understanding of the respective systems to appreciate. For instance, the internals of how PostgreSQL compresses TOAST data are invisible to the user, but can result in a startling improvement in system performance with certain types of data.

Another area outside of the scope of this document is that more applications support MySQL as the database of choice than PostgreSQL, and certainly that is an important factor for deciding which of these databases is better suited for a particular situation. Work on adding PostgreSQL support to some popular applications can be tracked at Software Ports. One thing you should evaluate when considering how applications use MySQL is that if they were initially targeted at versions before 5.0, they may not be compatible with newer features like the strict mode introduced in that version. If that's the case, such applications may be limited to the capabilities of the older version they were written against, and it may require a sort of porting effort to take advantage of the modern MySQL features.

Reliability

Data Integrity

Before version 5.0, MySQL well deserved its reputation for allowing inconsistent data to be inserted into the database. Guaranteeing Data Integrity with MySQL 5.0 explains the issues with older MySQL versions, and how they may be addressed using the strict SQL Mode available in the current version. Of course, any MySQL client is allowed to change its own SQL Mode to override this, with the result that these validation constraints are still not necessarily enforced by the server. Some good examples on this topic can be found at When MySQL Bites: Quirks to Watch Out For.
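To make that behavior concrete, here is a minimal sketch (the table and values are hypothetical) of how a single MySQL 5.0 session can opt out of strict checking and silently lose data that PostgreSQL would always reject:

CREATE TABLE account (balance DECIMAL(5,2) NOT NULL, code CHAR(2) NOT NULL);

SET SESSION sql_mode = '';
INSERT INTO account VALUES (12345.678, 'TOOLONG');
-- accepted with warnings: balance is clipped to 999.99, code is truncated to 'TO'

SET SESSION sql_mode = 'STRICT_ALL_TABLES';
INSERT INTO account VALUES (12345.678, 'TOOLONG');
-- now rejected with an error, which is how PostgreSQL treats it in every case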

PostgreSQL has always been strict about making sure data is valid before allowing it into the database, and there is no way for a client to bypass those checks.

Transactions and the Database Engine Core

The database core that gave MySQL its original reputation for speed is MyISAM. This engine has excellent read performance and its parser is very efficient for straightforward queries, which combine to make it very fast in read-intensive applications like web applications involving simple SELECTs. However, it is commonly known that MyISAM is more vulnerable to data corruption than most serious database applications would tolerate, and after a crash there may be a substantial delay while it rebuilds its indexes before the server can restart. Furthermore, it does not support foreign keys or transactions that would allow the database to have ACID properties. MyISAM also has issues dealing with concurrent reads and updates, since it only provides table level locking.

The integration of the InnoDB Storage Engine into MySQL greatly improved over MyISAM in terms of data integrity, adding a more robust log replaying mechanism for crash recovery and enabling ACID compliant transactions. However, this new approach comes with much more overhead, and InnoDB tables are not as fast as MyISAM ones for pure read loads. In addition, the internal MySQL metadata tables are still stored using MyISAM, which means they remain vulnerable to the traditional corruption issues associated with that storage engine. This issue is worked around using some complicated locking methods that have the potential to make a table alteration block for some time.

You should also be aware that it's possible in some environments (typically shared web hosting) to create what you believe to be a transaction-safe InnoDB table, but actually get non-ACID MyISAM instead. As is too often the case with MySQL, this will not generate an error; it will quietly do the wrong thing instead. See Whoops, no InnoDB table support for details about how to confirm you got what you wanted when creating your tables on a system that may be running an older MySQL version.
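A quick way to check for this problem, assuming a hypothetical table named payment, is to ask the server which engine it actually used rather than trusting the CREATE TABLE statement:

CREATE TABLE payment (id INT PRIMARY KEY, amount DECIMAL(10,2)) ENGINE=InnoDB;
SHOW CREATE TABLE payment;
-- or, equivalently:
SELECT engine FROM information_schema.tables WHERE table_name = 'payment';
-- if either reports MyISAM, the InnoDB request was silently ignored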

PostgreSQL has always focused on data integrity at the transaction level, keeping locking issues to a minimum, and barring hardware failure or grossly improper configuration it is difficult to corrupt a database.

It is worth observing that the database engine is part of the core of PostgreSQL, whereas InnoDB is a dual-licensed product presently licensed from Oracle Corporation. It is uncertain how Oracle may alter InnoDB in the future as they act in competition with MySQL AB, whereas PostgreSQL has no such conflict of interests. MySQL AB has been working on a new database engine core called Falcon in order to free themselves from this situation, but historically developing a database core engine that is both fast and reliable has required many years of work and testing before a mature product suitable for production use is available. Initial benchmarks suggest Falcon has plenty of rough edges that need to be addressed.

Foreign Keys

Proper implementation of design techniques like Database Normalization rely on the ability of the database to use Foreign keys to map relationships between tables. In MySQL, foreign keys are only supported with InnoDB. One problem with their implementation is that it is limited and will silently ignore some standard syntax. For example, when creating a table, even in the upcoming 5.1 release of MySQL "the CHECK clause is parsed but ignored by all storage engines". The basic design philosophy of PostgreSQL is to produce errors or warnings in similar situations where an operation is ambiguous or unsupported.
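As a rough illustration of that difference (the item table here is purely hypothetical):

CREATE TABLE item (name VARCHAR(20), price NUMERIC(8,2) CHECK (price > 0));
INSERT INTO item VALUES ('widget', -5.00);
-- PostgreSQL: ERROR: new row for relation "item" violates check constraint
-- MySQL 5.0/5.1: the row is stored, because the CHECK clause was parsed and then discarded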

Transactional DDL

In PostgreSQL, when you are inside a transaction almost any operation can be undone. There are some irreversible operations (like creating or destroying a database or tablespace), but normal table modifications can be backed out by issuing a ROLLBACK via its Write-Ahead Log design. That supports backing out even large changes to DDL like table creation.

MySQL doesn't support any sort of rollback when using MyISAM. With InnoDB, the server has an implicit commit that occurs even if the normal auto-commit behavior is turned off. This means that any single table alteration or similar change is immediately committed.

Experienced PostgreSQL DBAs know to take advantage of its features here to protect themselves when doing complicated work like schema upgrades. If you put all such changes into a transaction block, you can make sure they all apply atomically or not at all. This drastically lowers the possibility that the database will be corrupted by a typo or other such error in the schema change, which is particularly important when you're modifying multiple related tables where a mistake might destroy the relational key. There is no similarly safe way to adjust multiple schema sections with MySQL.
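For example, a schema upgrade touching several related objects might be wrapped like this in PostgreSQL (table and column names are only illustrative):

BEGIN;
ALTER TABLE customer ADD COLUMN referral_id integer;
CREATE TABLE referral (id serial PRIMARY KEY, source text);
CREATE INDEX customer_referral_idx ON customer (referral_id);
COMMIT;
-- if any statement fails, or you issue ROLLBACK instead of COMMIT,
-- none of the changes are applied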

See Transactional DDL in PostgreSQL: A Competitive Analysis for detailed examples demonstrating these differences.

Speed

Default configuration

Historically, the initial PostgreSQL configuration was designed to support older flavors of UNIX where allocating large amounts of memory wasn't necessarily possible. The result was that its use of memory for caching results was, by default, very pessimistic. On modern systems that have lots of memory available, this severely hinders untuned PostgreSQL performance.

The defaults have gotten much less pessimistic in recent releases. System configuration is now examined at database initialization time and more memory allocated if it is possible to do so. As a result, the untuned default configurations on recent PostgreSQL versions perform significantly better than older versions. In addition, changes in cache management in versions 8.1 and 8.2 allow even modest amounts of cache to be used more effectively than they used to be.

The primary tunable for both database systems works similarly, by allocating a block of shared memory dedicated to the database. MySQL tunes this with key_buffer_size when using MyISAM, and with innodb_buffer_pool_size when using InnoDB (note that you still need some MyISAM space for system tables even when InnoDB is the main storage engine for regular tables). PostgreSQL sizes its main memory space with shared_buffers.

The MySQL key_buffer_size defaults to using 8MB of memory. Earlier PostgreSQL configurations would also allocate 8MB of memory for the shared_buffers cache if possible. On a server like a current generation Linux system, it's expected the recent PostgreSQL releases would set shared_buffers to at least 24MB by default when the database cluster is created.

It is still worthwhile to go through the configuration files to tune them to match the available memory on a database server, as all these defaults are dramatically undersized compared to the amount of RAM in current systems. For a modern dedicated server, the rule of thumb for both PostgreSQL and MySQL is to size the dedicated memory to at least 1/4 of the total RAM in the machine, perhaps increasing to as much as 1/2 of RAM on the high side of normal. It's not out of the question to push this percentage even higher when using systems with very large amounts of RAM; MySQL InnoDB guidelines suggest even 80% isn't unreasonable. Performance comparisons using the defaults with either database are completely unrealistic of how a real system would be configured. Initial guidelines in this area can be found at 5-Minute Introduction to PostgreSQL Performance, Optimizing the mysqld variables and Optimizing the MySQL Server.
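As a rough sketch of what that tuning looks like for a hypothetical dedicated server with 2GB of RAM (exact values should be adjusted to your workload):

# postgresql.conf
shared_buffers = 512MB          # 8.2 accepts units; in 8.1 specify the value in 8kB pages

# my.cnf, with InnoDB as the main engine
innodb_buffer_pool_size = 512M
key_buffer_size = 32M           # still needed for the MyISAM system tables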

Benchmarks

Benchmarks are very difficult to do well; creating truly comparable benchmarks is a complex art. Many of the older performance benchmarks that have shown MySQL to be much faster than PostgreSQL have suffered from a number of problem areas:

  • Configuration: It's not unheard of to see a tuned MySQL compared to an untuned PostgreSQL instance. As mentioned above, untuned PostgreSQL used to be particularly pessimistic about what resources it had available. A truly fair comparison would match the amount of memory used by each system.
  • Transaction support: MyISAM benchmarks involve "transactions" that provide none of the ACID guarantees that PostgreSQL offers. This would frequently mean that apples were being compared to oranges.
  • Transaction grouping: Related to the above, PostgreSQL would sometimes be hindered in naive benchmarks that don't properly group transactions the way a real application would (see the sketch after this list). That can add the overhead of not just one transaction, but perhaps hundreds of thousands, to the cost of doing updates.
  • Serial versus concurrent behavior: A number of the behaviors of MyISAM are tuned for having a single user accessing the database. For example, its use of table locks to control access to tables means that under heavy user loads, it will slow dramatically. PostgreSQL degrades more gracefully with large numbers of simultaneous connections. Beware of naive benchmarks that involve a simple stream of database requests across a single connection.
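As an illustration of the transaction grouping point above (the table is hypothetical), compare a naive loader that pays full commit overhead on every row with one that batches its work:

-- one transaction per row: thousands of commits, each forcing a flush to disk
INSERT INTO measurements VALUES (1, 20.7);
INSERT INTO measurements VALUES (2, 20.9);

-- grouped the way a real application would do it: one commit for the whole batch
BEGIN;
INSERT INTO measurements VALUES (1, 20.7);
INSERT INTO measurements VALUES (2, 20.9);
COMMIT;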

Sun Microsystems 2007 jAppServer2004 Benchmark Results

Sun Microsystems, a neutral vendor selling hardware that runs many database types, has recently submitted test results on the well regulated SPECjAppServer2004 using both PostgreSQL and MySQL. There are just enough hardware differences between the two systems that it isn't fair to directly compare the two results. But the fact that both scores are close to one another and the configuration is similar does suggest that while there may be performance differences between the two database systems, the magnitude of that difference is not particularly large with this application type.

For comparison's sake, an Oracle on HP result offers a similar magnitude of performance on less impressive hardware, suggesting both open-source databases still lag the best of the commercial products in absolute performance efficiency. Some suggest Oracle's lead is even larger if you pick examples to put it in a better light, but be sure to read Benchmark Brou-Ha-Ha for some comments on actual pricing here (and to pick up some comments on a second PostgreSQL result using a smaller server). Note that Josh Berkus is a Sun employee whose role there includes being a member of the PostgreSQL Core Team, and his commentary should be evaluated accordingly.

If you do a fair comparison that includes software licensing costs, the performance per dollar figures for PostgreSQL and MySQL are similar to one another and very good relative to the average for the database industry. It would however be untrue to say that these open-source solutions are always a better choice than commercial offerings like Oracle just based on that; certainly the feature sets and absolute performance of each solution need to be considered as well.

Transaction Locking and Scalability

PostgreSQL uses a robust locking model called MVCC that limits situations where individual clients interfere with each other. A short summary of the main benefit of MVCC would be "readers are never blocked by writers". MVCC is used to implement a pessimistic implementation of the four SQL standard transaction isolation levels: "when you select the level Read Uncommitted you really get Read Committed, and when you select Repeatable Read you really get Serializable, so the actual isolation level may be stricter than what you select." The default transaction isolation level is "read committed".
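A small two-session sketch of that behavior (stock is a hypothetical table):

-- session 1
BEGIN;
UPDATE stock SET qty = qty - 1 WHERE item_id = 7;   -- change not yet committed

-- session 2, at the same time
SELECT qty FROM stock WHERE item_id = 7;
-- in PostgreSQL this returns immediately with the last committed value,
-- rather than blocking until session 1 commits or rolls back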

MySQL's InnoDB implements MVCC using a rollback segment, inspired by Oracle's design; their new Falcon engine works similarly. InnoDB supports all four SQL standard transaction isolation levels, with the default being "repeatable read".

When comparing the two models, PostgreSQL enforces client separation where the data operated on is always consistent under all circumstances; as the MVCC documentation states, "the reason that PostgreSQL only provides two isolation levels is that this is the only sensible way to map the standard isolation levels to the multiversion concurrency control architecture." MySQL allows configurations where client code that doesn't commit transactions properly can result in a data view that would be considered inconsistent by PostgreSQL's stricter standards. However, in situations where it's acceptable for data being read to have small inconsistencies, being able to use a less strict locking could be a performance advantage for MySQL.

Even when both systems are configured to one of the strict levels of transaction locking, the differences between the two implementations are subtle enough that which implementation will work better for a particular application is hard to state definitively. Recommended reading to understand this complicated topic is "Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control" by Weikum & Vossen. Speaking in the terminology used there, PostgreSQL uses multi-version timestamp ordering (MVTO) while InnoDB and Oracle use multi-version read consistency (MVRC). The main difference is that PostgreSQL is with-REDO/no-UNDO because it stores every row version in the main table, while Oracle/InnoDB implements with-REDO/with-UNDO where they reconstruct a block and/or row image from the log to provide read consistency. If you're willing to consider a third architecture, that of IBM's DB2, as a comparison point, additional good references on this topic are A not-so-very technical discussion of Multi Version Concurrency Control and Leverage your PostgreSQL V8.1 skills to learn DB2. IBM is clearly not a fan of the MVCC approach.

Partially because the PostgreSQL locking implementation is very mature (it's always active and performance of the associated code is accordingly critical), even in situations where MySQL initially appears faster PostgreSQL can pull ahead and scale to higher throughput when the number of simultaneous users becomes large. A good example of such a situation is demonstrated in the tweakers.net database test.

Counting rows in a table

One operation that PostgreSQL is known to be slow performing is doing a full count of rows in a table, typically using this SQL:

SELECT COUNT(*) FROM table

The reason why this is slow is related to the MVCC implementation in PostgreSQL. The fact that multiple transactions can see different states of the data means that there can be no straightforward way for "COUNT(*)" to summarize data across the whole table; PostgreSQL must walk through all rows, in some sense. This normally results in a sequential scan reading information about every row in the table.

Some DBMSes provide the ability for "COUNT(*)" queries to work via consulting an index. Unfortunately, in PostgreSQL, this strategy does not work, as MVCC visibility information is not stored at the index level. It is necessary to actually examine the rows themselves to determine if they are visible to the transaction or not.

In MySQL, MyISAM tables cache the row count information, making this type of count operation almost instant. That is the reason why there exists so much MySQL code that uses this construct assuming it's a trivial operation. But if you're using InnoDB instead, this is no longer the case. See COUNT(*) for Innodb Tables and COUNT(*) vs COUNT(col) for notes on the limitations of MySQL in this area. MySQL designs that may be deployed on InnoDB can't assume that a full row count will be fast, and therefore are hampered by similar limitations to those present in PostgreSQL.

It is worth observing that it is only this precise form of aggregate that must be so pessimistic; if augmented with a "WHERE" clause like

SELECT COUNT(*) FROM table WHERE status = 'something'

PostgreSQL, MySQL, and most other database implementations will take advantage of available indexes against the restricted field(s) to limit how many records must be counted, which can greatly accelerate such queries. PostgreSQL will still need to read the resulting rows to verify that they exist; other database systems may only need to reference the index in this situation.

One popular approach for applications that need a row count but can tolerate it not including transactions that are in the middle of being committed is to use a trigger-based mechanism to count the rows in the table. In PostgreSQL, another alternative when only an approximate count is needed is to use the reltuples field from the pg_class catalog table.
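For example, the approximate count mentioned above can be read with a query like this (orders is a hypothetical table, and the value is only as fresh as the last VACUUM or ANALYZE):

SELECT reltuples FROM pg_class WHERE relname = 'orders';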

Join Complexity

PostgreSQL uses cost-based query optimization methods in order to get good performance for many different types of joins. Query costs are assessed based on planner statistics collected when tables are analyzed combined with adjustable planner costs, and advanced features such as the Genetic Query Optimizer allow optimizing even very complicated joins efficiently.

MySQL doesn't have this level of sophistication in its planner, and the tunables for Controlling Query Optimizer Performance are crude. Developers instead do things like explicitly provide index hints to make sure joins are executed correctly. To make this task easier, MySQL provides a Query Profiler that is easier to work with than typical EXPLAIN data. Regardless of hinting, subselect optimization is a known weak spot in MySQL. There is also a fairly serious subquery null handling bug in MySQL 5.0 (which at this time appears to be still present in 5.1).

Finding order in execution provides several comparisons of how the two databases handle queries differently. Because of its more robust automatic optimization, PostgreSQL usually does a better job of handling complicated joins than MySQL--but only if the planner is properly configured (setting the effective_cache_size tunable too small is one common mistake) and statistics about the tables are kept up to date (typically via auto-vacuum). The fact that you must give the PostgreSQL optimizer correct information to work with, and can't explicitly control which join it uses, is a somewhat controversial design decision. The core PostgreSQL developers feel that it's more important to focus on improving the optimizer so it works correctly in all cases instead of just allowing queries to hint at a plan as a workaround for problems.
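In practice that means checking the plan and keeping the planner's inputs current, along the lines of this sketch (the query, table names, and the roughly 1GB figure are only illustrative):

ANALYZE orders;                        -- refresh planner statistics
SET effective_cache_size = 131072;     -- measured in 8kB pages, roughly 1GB here
EXPLAIN ANALYZE
SELECT c.name, sum(o.total)
FROM customers c JOIN orders o ON o.customer_id = c.id
GROUP BY c.name;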

There are some add-on tools some find useful for exploring the PostgreSQL planner. pgAdmin includes an explain plan viewer (sample). Another option is Visual Explain, originally a RedHat component that has been kept current and improved by EnterpriseDB. It comes bundled with the EnterpriseDB Advanced Server package and can be built to run against other PostgreSQL installations using the source code to their Developer Studio package.

Credits and Feedback

This document was written by Greg Smith with substantial contributions by Christopher Browne, Lukas Kahwe Smith, and other members of the PostgreSQL Advocacy mailing list. Some of the references linked to by this document point to articles also written by these authors.

Corrections, suggestions, flames, and similar feedback should be addressed to Greg, an independent consultant whose only affiliation with The PostgreSQL Global Development Group consists of submitting patches to improve the upcoming 8.3 release. He feels that PostgreSQL stands on its own merits and comparisons with MySQL should be as factual as possible, and feedback will be treated accordingly.

Courtesy : Greg Smith (Software consultant and author of the PostgreSQL Performance Pitstop site.)

Thursday, October 4, 2007

Oracle and Shell Scripting

Oracle and shell scripting are two very integral parts of any DBA's work, and here is a well written article on the topic by Casimir Saternos.

An Introduction to Linux Shell Scripting for DBAs
by Casimir Saternos

Learn some basic bash shell scripts for installing, running, and maintaining Oracle databases on Linux.

Published November 2005

About seven years ago, Oracle released the first commercial database on Linux. Since then, Oracle, Red Hat, and Novell/SUSE have been steadily collaborating on changes to the Linux kernel as they relate to database and application performance. For that reason, Oracle Database 10g for Linux includes enhancements that are closely related to the operating system. Now more than ever, DBAs need to have knowledge of and experience on this platform to best administer the systems under their watch.

There is a traditional division of responsibilities between sysadmins and DBAs. However, in practice, the distinction is not always clear. Many IT shops employ individuals who address concerns at the database as well as the operating system levels. And of course the Oracle Database itself uses operating system resources and is designed to interact closely with its environment.

Furthermore, many sysadmins and DBAs find it necessary or convenient to automate tasks related to their work. The installation of software, monitoring of system resources, and management of systems involve repetitive and error-prone tasks that are better addressed through automated processes than manual procedures.

One method that is used to automate such tasks is shell scripting. Shell scripts play a significant role in the Linux System from the time it is installed. Various scripts are called when the system is started up and shut down. Utilities by Oracle and other third-party vendors are invoked through shell scripts. Because they can be developed quickly, they have historically been used for prototyping applications. System Administrators have taken advantage of the functionality available through shell scripting to provide solutions that are tailored for the particular requirements and idiosyncrasies of the systems under their watch.

In this article, I will introduce functionality available through "bash" shell scripting that is relevant to installing, running, and maintaining Oracle databases on a Linux platform. Note that this article is designed for Linux scripting beginners or DBAs who are relatively new to Linux; most experienced Linux sysadmins will not find it helpful.

What Is a Shell Script?

A shell script is simply a text file containing a sequence of commands. When you run the file—or script—it executes the commands contained in the file. The term shell simply refers to the particular command-line user interface you use to communicate with the Linux kernel. Several different shells are available, including the C shell (csh), Korn shell (ksh), Bourne shell (sh), and Bourne-Again shell (bash). The shell itself is a command that reads lines from either a file or the terminal, interprets them, and generally executes other commands. The Bourne-Again shell incorporates features of the other shells mentioned and is the one that was used for the scripts this article demonstrates.

The first line in the script file can be used to dictate which shell will be used to run the script. This is the meaning of the first line found in all of the script examples:

#!/bin/bash
Why Use Shell Scripts?

Depending on your background, you may not see any immediate value to shell scripting as it relates to the DBA’s work. If you do not have experience with UNIX or UNIX-like systems, the myriad of cryptic commands might be cause for concern. Besides, in addition to being a relational database, Oracle 10g provides a robust platform for processing data within the database as well as several methods of interacting with the OS outside of the database.

However, there are several reasons you might find yourself delving into the world of shell scripting, including the following:

  • You are in a situation in which you must support already existing scripts.

  • You want to automate system setup that occurs prior to the installation of the Oracle software. For instance, you could write a script to check the initial state of the OS and report any prerequisites that must be met before installation of the software. The script might also create relevant OS users and groups and set environmental variables for the users.

  • A running Oracle database can be used to execute manual or scheduled tasks. However, some tasks need to be run when the database is not running. You can use a script to stop or start a database (as well as a listener or a related database process). Such an action cannot be initiated from within the database itself.

  • You need a mechanism for monitoring the state of a database (i.e. if it is running and available to process queries). Such a script could also monitor other processes and resources that are not Oracle-specific to provide a fuller picture of what is occurring in the system.

  • You need to automate backups. Oracle Recovery Manager (RMAN) is a utility that allows you to develop backup scripts that can run on any platform. You can call Oracle Recovery Manager from a shell script and use it to perform a wide range of backup and recovery activities.

  • You might have a requirement that is not specific to a single database. Perhaps you have several databases on a single machine. It may not be advisable to fulfill the requirement by using a single database, and it introduces potential security issues as well. Under such circumstances, shell scripting provides a means for fulfilling the requirement in a manner that does not associate the process with a single database.

When Not to Use Shell Scripts

Oracle Database includes functionality that extends beyond the traditional definition of an RDBMS. Like any other piece of software, it uses resources that are provided by the operating system, but it can “see” and “change” its environment to a much greater degree than other software. SQL and Oracle’s fixed views provide a picture of the system from inside the database, whereas shell scripting provides a view of the system from outside of the database. Shell scripting is not the solution for every problem.

It is important to recognize that many aspects of the operating system can be monitored and modified from within the database. Oracle's fixed views (the views with a v$ prefix) can be used to determine the host name of the machine (v$instance) or the platform name on which the database is running (v$database). The location and other attributes of the files associated with the database can be determined in this manner as well. The location and other attributes of datafiles (v$datafile, dba_data_files), temp files (v$tempfile, dba_temp_files), redo logs (v$logfile), archive logs (v$archived_log), and control files (v$controlfile) can be queried directly from the database. You can determine information about the flash recovery area (v$recovery_file_dest) from this view, as well as by looking at some init.ora parameters (db_recovery_file_dest, db_recovery_file_dest_size). The status of processes (v$process) and memory (v$sga, v$sgastat, and so on) can be queried as well. There are various built-in PL/SQL packages as well as the ability to create Java and C database objects that allow for additional access to the underlying OS.

If you are considering scripting for a task that requires a good deal of database access, scripting is probably not your best option. Later in this article, there is a description of how to access the database by using SQL*Plus, but in many cases you are better off approaching the problem by using another language.

The charts below summarize information accessible from within the database itself:

Server/OS Information

Host name where the instance is running

Representative query:
select host_name
from v$instance;

Notes: You can also obtain this information by running hostname or uname -n from bash.

Operating system platform

Representative query:
select platform_name from v$database; -- (10g)

Notes: Similar information is returned if you run uname -s.

File Information

Control files

Representative query:
select name
from v$controlfile;

Notes: Location of the database control files. The init.ora parameter control_files also contains this information.

Datafiles

Representative query:
select file_name
from dba_data_files;

Notes: Location of the database datafiles.

Temp files

Representative query:
select file_name
from dba_temp_files;

Notes: Location of database temporary files.

Log files

Representative query:
select member
from v$logfile;

Notes: Location of redo logs.

Archived logs

Representative query:
select name
from v$archived_log;

Notes: Location of archived redo logs. The init.ora parameters log_archive_dest_n also contain this information. This query will not return results if your database is not in Archivelog mode.

Flash recovery area

Representative query:
select name
from v$recovery_file_dest;

Notes: The directory being used in an Oracle 10g installation for the flash recovery area. The init.ora parameter db_recovery_file_dest also contains this information.

Other points of access on the file system indicated by parameters

Representative query:
select *
from v$parameter
where value like '%/%'
or value like '%\%';

Notes: The results of this query can vary significantly, depending on your installation and version of Oracle Database. Parameters that may be returned include:

spfile
standby_archive_dest
utl_file_dir
background_dump_dest
user_dump_dest
core_dump_dest
audit_file_dest
dg_broker_config_file1
dg_broker_config_file2

Programmatic access to the file system

Representative query:
select directory_path from dba_directories;

Notes: The Oracle UTL_FILE_DIR parameter and DIRECTORY database objects can be used to access files that are not a part of standard database functioning.

Process Information

Session Processes

Representative query:
select p.spid, s.username, s.program
from v$process p, v$session s
where p.addr = s.paddr
order by 2, 3, 1;

Notes: The spid can be correlated with ps -ef results to compare the information available within the database with OS information for a given process.

Processes related to parallelism

Representative query:
select slave_name, status
from v$pq_slave;

Notes: Many aspects of Oracle Database, such as loading, querying, object creation, recovery, and replication, can take advantage of parallelism to speed up activities that can be broken down. The parameter parallel_threads_per_cpu sets an instance's default degree of parallelism.

Memory Information

Program Global Area

Representative query:
select * from v$pgastat;

Notes: The parameter pga_aggregate_target is used to configure memory for all dedicated server connections. Linux utilities, including vmstat and top, can be used to monitor memory usage.

System Global Area

Representative query:
select * from v$sga;

Notes: The parameters SGA_MAX_SIZE and SGA_TARGET are used to configure dynamic memory allocation features of Oracle Database 10g. Other parameters can be used to manually allocate memory for particular purposes. Again, various Linux utilities are available to monitor memory allocation.

BASH Scripts

Scripts are either called as part of an automated process (with no human intervention) or run interactively, with a user responding to prompts. As long as you have executable permission for a file, you can run it from the command line by typing its name. If you do not have executable permission for the file but do have read permission, you can run the script by preceding it with sh.

If a script is designed to be run without user input, several options are available for calling it. You can run a script in the background and continue even if you disconnect, by entering a command in the following form:

nohup /path_to_dir/myscript_here.sh &
This can be useful for running scripts that take a long time to complete. The at command can be used to execute a script in the future, and cron can be used to schedule scripts to execute on a recurring basis.
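For instance, a monitoring script might be scheduled with a crontab entry like the following (the paths and time are only examples):

# run the check every night at 2:00 a.m. and append its output to a log
0 2 * * * /home/oracle/scripts/check_db.sh >> /home/oracle/logs/check_db.log 2>&1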

The following examples cover the essential aspects of providing output to view (using echo), looping, conditional logic, and variable assignment.

print_args.sh. Arguments are words to the right of the command name that are passed into the script. To access the first parameter, you use the $1 variable. The $0 variable contains the name of the script itself. The $# variable contains the number of arguments in the script. A handy way to iterate through all of the parameters passed involves the use of a while loop and the shift command. This command is what lets you iterate through all the arguments in the argument list (rather than remaining in an infinite loop).

while [ $# -ne 0  ]
do
echo $1
shift
done
If a script takes a filename as an argument (or prompts a user for a filename) and the file will be read later in the script, it is advisable to check whether it is accessible and readable. For example, a recovery script that involves the selection of a backed-up control file might prompt the user to make a selection that will be used later in the script to restore the file.
if [ ! -r $1 ]; then # exit if $1 does not exist or is not readable
echo "File $1 does not exist or is not readable."
exit;
fi
The sequence of characters
 
if [ ! -r $1 ];
is what actually performs the test. If the contents between the brackets evaluate to true, the commands that appear between if and fi will be executed. The actual test appears between the brackets. The exclamation point serves to negate the test you are doing. The -r option checks to see if the file is readable. What is being tested in this particular case is the first argument being passed to the script. By using a different test ( -d ), you can check to find out if a given entry is a directory (see is_a_directory.sh).

do_continue.sh. This example is a simple representative sequence of commands that can be used to read user input for various purposes. Before running a process that can result in data loss or other undesirable results under certain conditions that are not determinable from within the script, it is advisable to include a prompt asking if the user actually wants the script to execute the next command or commands. The following example asks if the user wants to continue, reads a variable named doContinue from the command line, and evaluates what the user entered. If the user enters anything other than “y,” that person is informed that the script is “quitting” and it exits without executing the remainder of the script following the end of the if block (fi).

doContinue=n
echo -n "Do you really want to continue? (y/n) "
read doContinue

if [ "$doContinue" != "y" ]; then
echo "Quitting..."
exit
fi
It is imperative that only users with the correct permissions and environment run a given script. A useful check in a script tests the user who is attempting to run the script. If you enclose a command within back-quote (`) characters, the results of the command can be returned to the script. The following example retrieves the currently logged-on user, by using whoami, and displays the date, by using the date command later in the script.
 
echo "You are logged in as ‘whoami‘";

if [ ‘whoami‘ != "oracle" ]; then
echo "Must be logged on as oracle to run this script."
exit
fi

echo "Running script at ‘date‘"
Scripts written to interact with Oracle Database sometimes require the entry of sensitive information such as a database password. The stty -echo command turns off the screen echo, so that the information entered for the subsequent read command will not appear on the screen. After the sensitive information has been read and stored in a variable (pw in the example below), the display can be turned back on with stty echo.
 
stty -echo
echo -n "Enter the database system password: "
read pw
stty echo
Oracle Scripts

Some files reside in a fixed location for a given Oracle install. You can determine the Oracle inventory by viewing the /etc/oraInst.loc file. The /etc/oratab file identifies databases (and other Oracle programs) installed on the server.

get_inv_location.sh. This script is a bit less intuitive than the previous examples. By breaking down the script into its component commands, you will get a better understanding of what is being accomplished.

To determine inventory location, you are going to pipe the results of the cat command (which displays the contents of the file) to grep (a utility that prints lines that match a given pattern). You are searching for lines that contain the literal inventory_loc.

cat /etc/oraInst.loc | grep inventory_loc
If there is more than one inventory location due to multiple installs, you want to exclude lines commented out with a #. The -v option excludes lines that contain the given pattern.
 
cat /etc/oraInst.loc |grep -v "#"|grep inventory_loc

The result of this command will look something like this:

inventory_loc=/u01/oraInventory
You can redirect standard out to a file, using the > redirection. If the file does not exist, it is created. If it does exist, it is overwritten.

 
cat /etc/oraInst.loc|grep -v "#"|grep inventory_loc > tmp
Once you have the record that indicates the inventory location, you want to remove the portion of the record prior to the equal sign. This time you pipe the results of the cat command to awk (a pattern-scanning and processing language often used to split up variable-length fields), which essentially tokenizes the string. The -F option directs awk to use the equal sign as the delimiter. You then print the second token of this string ($2), which is everything to the right of the equal sign. The result is our inventory location (/u01/oraInventory).
cat tmp | awk -F= '{print $2}'
There is no particular reason to allow the temporary file (tmp) to remain, so it can be removed.

rm tmp
list_oracle_homes.sh. If you would like to determine the ORACLE_HOME for a given database, you have several options. You can log on as the database user and echo the $ORACLE_HOME variable. You can also search through the /etc/oratab file and select the name associated with a given instance. Database entries in this file are of the form
 
$ORACLE_SID:$ORACLE_HOME::
The following one-liner prints out the ORACLE_HOME of an entry with an ORACLE_SID of TESTDB:
cat /etc/oratab | awk -F: '{if ($1=="TESTDB") print $2 }'

However, what if you have a requirement that an operation needs to be performed on each ORACLE_HOME listed in the /etc/oratab file? You can iterate through such a list by utilizing the following code snippet.

dblist=`cat /etc/oratab | grep -v "#" | awk -F: '{print $2 }'`

for ohome in $dblist ; do
echo $ohome
done
The dblist variable is being used as an array. All ORACLE_HOME paths are held by this variable. A for loop is used to iterate through this list, and each entry is assigned to the variable ohome and then echoed to standard out.

search_log.sh. A variety of logs are generated by Oracle products, and you might be interested in monitoring them. The database alert log contains messages that are critical to database operations. Log files are also generated when products are installed or deinstalled and when patches are applied. The following script iterates over a file passed to it as an argument. If any lines are found that contain ORA-, an e-mail message is sent to a designated recipient.

 
cat $1 | grep ORA- > alert.err

if [ `cat alert.err|wc -l` -gt 0 ]
then
mail -s "$0 $1 Errors" administrator@yourcompany.com < alert.err
fi
The specific test being performed is a count of the number of lines in the file alert.err, which is written when you redirect the grep results to alert.err. If the line count (wc -l) is greater than (-gt) zero, the contents of the if block will execute. In this case, you are using mail (sendmail might also be used) to send a message. The title of the message contains the script being executed ($0) and the name of the log being searched ($1), and the lines that matched our initial search (ORA-) form the body of the message.

Environmental variables such as ORACLE_HOME, ORACLE_BASE, and ORACLE_SID can be used to locate resources that are not in a fixed location in the Linux environment. If you are administering an Oracle E-Business Suite 11i application instance, you have numerous other environmental variables that can be used to locate resources. These include APPL_TOP, TWO_TASK, CONTEXT_NAME, and CONTEXT_FILE, to name a few. To see a complete list in your environment, execute the following command and examine the resulting file (myenv.txt):

env > myenv.txt
Various combinations of these environmental variables can be used as the location of a file being searched. For example, an alert log location might be designated as
 
$ORACLE_BASE/admin/$ORACLE_SID/bdump/alert_$ORACLE_SID.log

Based on the principles introduced in this script, a larger one can be written and scheduled to execute at periodic intervals that will search the contents of the alert log (or another file of interest) and send an e-mail if any errors exist. Then the contents of the log can be moved to another file, so that only the most recent error messages will be sent via e-mail.

Oracle Recovery Manager Scripts. Oracle Recovery Manager (RMAN) is a utility that can be used to administer database backup and recovery. This greatly simplifies administration on multiple platforms, because all backup scripts can be written to be run by RMAN, reducing the amount of platform-specific code. RMAN can be called by the underlying operating system and passed a script. For example, a cold (cold.sh) backup might consist of the following script:

#!/bin/bash
rman target / << EOF
shutdown immediate;
startup mount;
backup spfile;
backup database;
alter database open;
delete noprompt obsolete;
quit;
EOF
Line 1 indicates that you are using the bash shell. Line 2 invokes Oracle Recovery Manager and specifies the OS user login to the target database (specified in the environmental variable $ORACLE_SID). The << EOF following this indicates that the subsequent commands will be passed into RMAN to be processed. The EOF on the last line indicates that you have reached the end of the series of commands to be passed into RMAN. RMAN is then used to shut down the database, start up and mount the database, and proceed to back up the server parameter file and the contents of the database. The database is then opened. Any backups that are older than those specified in the retention policy are then deleted. See the RMAN documentation to construct a backup that is relevant for your situation.

Nightly backups are commonly scheduled and run automatically. The script above could be called and the contents of standard out sent to an e-mail address with the following command:

sh cold.sh | mail -s"Backup `date`" administrator@yourcompany.com

Other Oracle utilities can be run from within shell scripts as well. The tnsping utility can be used to see if a given Oracle connection identifier can contact the listener. You might run this utility to check for connection problems:

tnsping ptch04 |grep TNS-
Database exports and imports (traditional and data pump) are also good candidates for the scripting of repeating processes.

Database Installation. Many of the steps involved in the setup of a database can be automated. Before you install Oracle 10g on Linux, you need to run various tests to verify the minimum required version of packages and the settings of kernel parameters. You can query the version of a package by using the rpm command with the -q option.

 
rpm -q compat-libstdc++

You can determine various aspects of the system by looking at the /proc “virtual” or “pseudo” file system. It contains not real files but, rather, runtime system information that can be viewed as if it resided in files. For instance, /proc/meminfo contains memory information for the system, and grep MemTotal /proc/meminfo displays the total memory of the system. By using awk as you did earlier, you could isolate the amount of memory in kilobytes, by using:

grep MemTotal /proc/meminfo | awk '{print $2}'
Such a command could be used in the context of a script that would do comparisons and respond accordingly (even updating the system itself). The sample scripts 10gchecks_kernel.sh and 10gchecks.sh simply display current and recommended versions and settings based on the Oracle documentation.
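Building on that, a short hypothetical prerequisite check might compare the detected memory against a required minimum and report the result (the 1GB threshold is only an example):

#!/bin/bash
# warn if the server has less RAM than the minimum we decided to require
MIN_KB=1048576
MEM_KB=`grep MemTotal /proc/meminfo | awk '{print $2}'`
if [ $MEM_KB -lt $MIN_KB ]; then
echo "Only ${MEM_KB}kB of memory found; at least ${MIN_KB}kB is recommended."
fi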

Database Monitoring. The ps command can be used to report process status and to check to see if a database, listener, script, or any other process of interest is running. If you want to list all the databases that are currently running on a server, you can run the following command:

 
echo "‘ps -ef | grep smon|grep -v grep|awk '{print $8}'| awk -F \"_\"
'{print$3}'‘"
Although this is functional, it is a bit difficult to understand at first glance. The first command, ps (using the -ef options for a full listing of all processes), finds all of the processes running on the server. The next, grep, searches for SMON (the Oracle System Monitor background process), which indicates that the database is running. You want to remove entries that refer to the grep command itself, which is being run. You then use awk to locate the eighth column in the listing, which contains the system monitor process name in the form ora_smon_<SID>. The last instance of awk then uses the underscore character as the delimiter to search for and print the database name that owns the SMON process. The underscore character needs to appear in quotes, and the backslash is used before each of the quotes to escape those quotes (because the entire string appears within a set of double quotes).

exec_sql.sh. As mentioned earlier, it is possible to query the database from a shell script provided the user has access to sqlplus. The following example returns a (space-delimited) list of machines that are currently maintaining sessions in the database:

 
#!/bin/bash

output=`sqlplus -s "/ as sysdba" <<EOF
set heading off feedback off verify off
select distinct machine from v\\$session;
exit
EOF
`
echo $output
This script is similar to the previous RMAN script in that you are inputting commands into another program. A local OS authenticated connection is made to the database as sysdba. To prevent extraneous messages from being returned, this script turns off the heading, feedback, and verify options of SQL*Plus. The query is executed and SQL*Plus is exited. Note the double-backslash before the dollar sign in the view name. These are required escape sequences within the string: The first slash escapes the second slash, which escapes the dollar sign. Again, not pretty, but functional. As mentioned previously, if you are going to be writing something that requires extensive database access, shell scripting is not the best option. It might be better to rewrite the script in PL/SQL, Perl (which uses syntax similar to that used in shell scripting), Python, Java, or another language of your choice.

Conclusion

Shell scripting can be an effective tool for quickly automating repetitive and error-prone administration tasks. The examples in this article provide an introduction to the possibilities available but are far from comprehensive. Every system has distinct quirks and foibles and a unique configuration. An administrator will develop unique solutions to meet the needs of the particular system.

Sunday, June 10, 2007

Oracle 10g Vs PostgreSQL 8 vs MySQL 5

Full Disclosure: I am strongly biased towards Oracle and fully expected no real competition.

License: I will not get into a debate over open source and closed source. Oracle is a commercial database and requires licensing to use. MySQL and PostgreSQL are both open source projects. Oracle provides a free developers license for you to "test out" and create prototypes.

Configuration: I am installing on an older machine with 256 Megs of RAM and a 40 Gig hard drive. The CPU is a 633 MHz Celeron. The OS is Windows 2000 Professional. I want to install at the low end of the spectrum to gauge the ability for home user-type installations.

OS: I chose to perform this comparison under Windows, as I believe that is still the most common platform that people new to the world of databases will use. By that, I mean people installing at home. For new corporate users, the odds are good that they will not have a choice of either OS or database, so a comparison is pointless.

Versions:

PostgreSQL 8.0 - PostgreSQL 8.0 is a very new product. I chose 8.0 as it is the latest version and it natively supports Windows. This is the first version that actually does support Windows without an emulator or third-party recompilation.

MySQL - MySQL 5.0 is also a very new product. As a matter of fact, the version I used, v5.0.4, is actually a beta version. I argued with myself about whether I should use the stable v4 or the beta v5. I chose v5 because v4 does not compete with PostgreSQL or Oracle. V4 does not have stored procedures, triggers, views, etc. Without those features, I would not consider it a contender. The upside is that v5 does support those features.

Oracle 10g - This is the latest version of Oracle and was released last year. In my opinion, Oracle 10g is the gold standard of databases. Nevertheless, I also recognize that it is very expensive and requires quite a bit of knowledge to support in a production environment.



Ranking Factors:
  • Documentation and Getting Started Support
  • Ease of installation
  • Ease of Verifying Successful Installation
  • Creation of Non-Admin User
  • Time to Run First Query
  • Resource Requirements

Ranking Values: I rank each factor a score of 1 through 10, with 10 being the best or highest.

Documentation and Getting Started Support

PostgreSQL - I found a lot of support for previous versions (under Linux, Cygwin, etc.), but almost nothing for 8.0 under Windows. The PostgreSQL documentation website, http://www.postgresql.org/docs/ , provides very good, very complete documentation. In my opinion, one of PostgreSQL's strengths in the past has been the documentation.

That web site includes a very good installation document. However, post installation and getting started documentation is very limited. I hope that that will change in time.

I googled "PostgreSQL getting started" and got thousands, mostly version 7, of hits. PostgreSQL has some Usenet newsgroup support, but I noticed they were not very active.

PostgreSQL Documentation Ranking: 5

MySQL - Because v5 is so new, there is not very much documentation yet. There is a world of documentation for MySQL v3 and v4. Check Amazon or BN.com. MySQL seems to be the choice for a lot of smaller Internet and java-oriented companies.

Because v5 is beta, I will not rank this factor according to the documentation available for that release. V4 has amazing documentation and I fully expect v5 to have the same by the time the software is released to production. I will rank according to v4 documentation.

I googled "MySQL getting started" and got millions, mostly version 4, of hits. I could find no MYSQL-specific Usenet newsgroup support, but I noticed some postings for MySQL support on the comp.databases group.

MySQL Documentation Ranking: 8

Oracle - Oracle, as a commercial product, has a large staff of full-time technical writers. Every feature is spelled out and many features receive their own books. The installation documentation provided by Oracle is superior and, where that is lacking, it is made up by third party sites dedicated to Oracle. The documentation, as well as the documentation portal, is very mature.

In addition to installation guides for just about every OS, Oracle provides a "2 day DBA" guide. This document is indispensable for new users.

I googled "Oracle getting started", and received millions of hits. Oracle also has robust Usenet newsgroup support.

Oracle Documentation Ranking: 10

Ease of Installation

PostgreSQL

I haven't previously used PostgreSQL outside of Cygwin (a Unix environment for Windows). I was surprised by the feel of the installation: it seemed completely native, ran flawlessly, and was also very quick.


PostgreSQL installed the quickest. According to the documentation, you can install PostgreSQL without administrative rights. As this is the first version of PostgreSQL to support Windows (without Cygwin), I wanted to test the Services support. PostgreSQL installed as a service and came up without problems after rebooting.

PostgreSQL Installation Ranking: 10

MySQL

I ran a web site in the past, mainly for my own amusement, and used MySQL as the database. The technology stack I used was Java/Apache/MySQL. I was fully expecting MySQL to win the installation competition based on experience. However, MySQL was the only install to fail during installation. I performed a little cleanup and then re-ran the install. It failed again. I cleaned up again, rebooted and re-ran the installation. This time it installed. I'm not sure what caused the error. It was very cryptic but it was the same both times it failed. I was able to bulldoze past that, though. Remember, this is a beta release.

MySQL also installed as a service and on reboot came up with no errors. The install, when it ran correctly, ran quickly.

MySQL Installation Ranking: 5

Oracle

Oracle again proved that it is a very mature product. Of the three, Oracle is the only one that can NOT be installed without admin rights. It would be nice for this to be an option.

It did, however, install easily. I received no errors. Oracle was the only install to point me towards the next step at the end of the install. This is a sure-fire way to support new users and requires almost no effort. All databases should provide this kind of support.

Where Oracle really took a hit was in the amount of time it took to install. This is a real issue for home users. There were a couple of spots where I thought the machine might be hung as I received no feedback for an extended amount of time.

Oracle Installation Ranking: 7

Ease of Verifying Successful Installation

Verifying the success of the install was a little different for each product, but all three provide data access tools. I followed the instructions in the provided documentation and was able to access the server in all three instances.

Verification Ranking for all 3: 10

Creation of Non-Admin User:

PostgreSQL: PostgreSQL uses the standard CREATE USER syntax and I was able to perform this task easily. The documentation walks the user through this task (which is required for security reasons).
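For reference, a minimal sketch of what this looks like in PostgreSQL; the user and database names here are placeholders, not anything from my actual test:

    -- Create a non-admin login and a database it owns (names are placeholders)
    CREATE USER appuser WITH PASSWORD 'secret';
    CREATE DATABASE appdb OWNER appuser;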

PostgreSQL Create User Ranking: 10

MySQL: In v5, MySQL also uses the CREATE USER syntax; prior releases did not. Because the documentation is not yet complete, it is not obvious that CREATE USER is available. For a new user, this can be confusing.
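Assuming MySQL 5.0's syntax, the equivalent sketch looks roughly like this; note that CREATE USER only creates the account, and a separate GRANT is still needed to give it any privileges (the names below are placeholders):

    -- Create the account, then grant it rights on one database (placeholders)
    CREATE USER 'appuser'@'localhost' IDENTIFIED BY 'secret';
    GRANT ALL PRIVILEGES ON appdb.* TO 'appuser'@'localhost';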

MySQL Create User Ranking: 7

Oracle: Oracle uses the standard CREATE USER syntax and I was able to perform this task easily. The documentation walks the user through this task.

Oracle Create User Ranking: 10

Time to Run First Query

All three database documentation sets provide a tutorial for getting connected and running your first query.
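As a rough illustration, the "first query" in each case is a one-liner run from the bundled command-line or GUI tool, something along these lines (exact commands vary by product):

    -- PostgreSQL or MySQL: confirm the server answers and report its version
    SELECT version();
    -- Oracle: a similar sanity check against the dummy DUAL table
    SELECT sysdate FROM dual;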

All three First Query Ranking: 10

Resource Requirements:

PostgreSQL - I was amazed at PostgreSQL's minimal need for resources. I performed some tests like opening web browsers and MS-Word while connecting remotely and could see no impact at all on my test machine.

I couldn't find the minimum hardware specs, but 256 MB on an old machine seems to be more than enough.

PostgreSQL Resource Ranking: 10

MySQL - MySQL's minimal need for resources is well known. That is one of its biggest selling points and always has been. I performed the same test that I performed for PostgreSQL, and MySQL performed as well as PostgreSQL.

Oddly enough, I couldn't find the minimum hardware specs for MySQL either, but 256 MB on an old machine seems to be more than enough.

MySQL Resource Ranking: 10

Oracle - Oracle uses the most memory, and Oracle 10g uses even more than previous versions. It really would run better on a 512 MB machine, and I would recommend 1 GB.

On the test box, it takes forever (well, many long minutes) for Oracle to even start up. When I opened Word, I started getting swapping and stalling. With multiple remote connections, the database came to a standstill. Oracle will install and run in 256 MB, but I don't recommend it.

Oracle Resource Ranking: 3

So how does each stack up? Here are the final rankings:

PostgreSQL: 55

MySQL: 50

Oracle: 50

So, that is the install and first-use ranking, and it is not at all what I expected. I think it's great that all three rated so high. I was not expecting that, or that Oracle and MySQL would tie. I was definitely not expecting PostgreSQL to come in first.

However, installation is only a small piece of using a database. There is language support, administration, additional usability features, security and third-party support. In the near future, I will write up a comparison of these components.


Thursday, May 17, 2007

Why Raid?

Why do people think RAID means performance?

George Ou, Technical Director at ZDNet and a fellow ZDNet blogger, has a great post about real-life RAID performance - hardware vs software - plus some helpful comments about data layout, especially for MS SQL Server. As George notes, data layout can have a major impact on storage performance. But what about RAID itself? What is the theory behind RAID performance?

RAID = Redundant Array of Inexpensive Disks

RAID wasn’t originally about performance; it was about cost. Some CompSci jocks at Berkeley wanted to use 5.25″ drives for a bunch of cheap storage because they couldn’t afford the then-common 9″ drives. The small drives had poor reliability as well as lower cost, so they cobbled them together to create the first RAID array.

But as they worked out the details, they saw that RAID could have performance advantages for certain workloads.

The three pillars of RAID performance

* Cache
* Striping
* Chunk size

Let’s look at all three.

Cache

Cache is simply RAM, or memory, placed in the data path in front of a disk or disk array. You can read or write RAM about 100,000 times faster than a fast disk and about 300,000 times faster than a slow disk. Writing or reading cache is a lot faster than disk.

Most external array controllers are redundant as well, with some kind of load-balancing driver that keeps both controllers productive. In this case, the cache has to be dual-ported. If a controller fails, the other controller can see all the pending writes of the failed controller and complete them.

Dual-ported cache, controller failover, dual server interfaces and failover drivers are all tricky to engineer and test, which is one reason why mid-range and high-end controllers are so expensive. But they sure speed writes up.

Striping for speed

Striping is taking a virtual disk that the operating system sees, and spreading that virtual disk across several real, physical disks.

A RAID controller presents something that looks like a disk, like a C: drive, to the operating system, be it Windows, OS X or Linux. The RAID controller isn’t presenting a real disk. It is presenting a group of disks and making them look like a single disk to your computer.

The advantage is that instead of a single disk’s performance, you now have the performance of several disks. Instead of 50 I/Os per second (IOPS) on a 5400 RPM drive, you might have 150, 200 IOPS or more, depending on the number of drives. If you use fast 15k drives you might reach 900, 1,000 or more IOPS.

And instead of 1.5 gigabits per second bandwidth, you might have 4.5 or 6 Gb/sec. If you have a cache that you are in a hurry to empty, a nice fast stripe set is very helpful.

The hard part: spreading I/Os smoothly across all the disks. If they all jam up on one disk your costly storage system will be no faster than a single disk. That’s where chunk size comes in.

Chunk size: the hidden key to RAID performance

Stripes go across disk drives. But how big are the pieces of the stripe on each disk? The pieces a stripe is broken into are called chunks. Is the stripe broken into 1k byte pieces? Or 1 MB pieces? Or even larger? To get good performance you must have a reasonable chunk size.

So what is a reasonable chunk size? It depends on your average I/O request size. Here’s the rule of thumb: big I/Os = small chunks; small I/Os = big chunks.

Do you do video editing or a lot of Photoshop work? Then your average request size will be large and your performance will be dominated by how long it takes to get the data to or from the disks. So you want a lot of bandwidth to move data quickly. To get a lot of bandwidth you want each disk to shoulder part of the load, so you want a small chunk size. What is small? Anywhere from 512 bytes (one block) to 8 KB.

If you are running a database and doing lots of small I/Os - say 512 bytes to 4 KB - then you want to maximize your IOPS, which ideally means sending each I/O to only one disk and spreading the I/Os evenly across the disks. What you don’t want is a single I/O split across two disks, since waiting on both sets of heads will slow things down. So you want a large chunk size - at least 64 KB or more. A large chunk means that most I/Os get serviced by a single disk, leaving the remaining disks free to handle other I/Os.

However, many databases use their own strategies to gather I/Os to minimize I/O overhead. In that case you need to know what the database is actually doing to choose the right chunk size.

The Storage Bits take

RAID systems are complex and their operation is sometimes counter-intuitive. You should take the time to understand what your workload is in order to configure a RAID system for good performance. Otherwise you could end up with an expensive and slow problem instead of a fast I/O solution.

Database Internals

Storage

Database tables and indexes are typically stored in memory or on hard disk in one of many forms: ordered or unordered flat files, ISAM, heaps, hash buckets, or B+ trees. Each has its own advantages and disadvantages. The most commonly used are B+ trees and ISAM.

Indexing

All of these databases can take advantage of indexing to increase their speed, and this technology has advanced tremendously since its early uses in the 1960s and 1970s. The most common kind of index is a sorted list of the contents of some particular table column, with pointers to the rows associated with each value. An index allows a set of table rows matching some criterion to be located quickly. Typically, indexes are built using the same kinds of structures mentioned above (such as B+ trees, hashes, and linked lists); a specific technique is usually chosen by the database designer to suit the particular type of index required.
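A sketch of the idea in SQL, using a hypothetical table not tied to any particular product: once the index exists, a lookup by name no longer has to scan the whole table.

    -- Hypothetical table and index
    CREATE TABLE employee (
      id      INTEGER PRIMARY KEY,
      name    VARCHAR(100),
      dept_id INTEGER
    );
    CREATE INDEX idx_employee_name ON employee (name);

    -- The index lets this lookup locate matching rows directly
    SELECT * FROM employee WHERE name = 'Smith';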

Relational DBMSs have the advantage that indexes can be created or dropped without changing existing applications that make use of them. The database chooses between many different strategies based on which one it estimates will run the fastest. In other words, indexes are transparent to the application or end user querying the database; while they affect performance, any SQL command will run with or without indexes existing in the database.

Relational DBMSs utilize many different algorithms to compute the result of an SQL statement. The RDBMS will produce a plan of how to execute the query, generated by estimating the run times of the different algorithms and selecting the quickest. Some of the key algorithms that deal with joins are the Nested Loops Join, Sort-Merge Join, and Hash Join. Which of these is chosen depends on whether an index exists, what type it is, and its cardinality.
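Most relational databases will show you the plan they chose. A sketch using PostgreSQL-style syntax, joining the hypothetical employee table above to an equally hypothetical department table; MySQL and Oracle have their own EXPLAIN variants, and the output depends heavily on indexes and data volumes:

    -- Ask the planner how it would execute a two-table join
    EXPLAIN
    SELECT e.name, d.name
      FROM employee e
      JOIN department d ON e.dept_id = d.id;
    -- The plan may use a Nested Loops, Sort-Merge, or Hash join,
    -- depending on available indexes and estimated row counts.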

Transactions and concurrency

In addition to their data model, most practical databases ("transactional databases") attempt to support database transactions. Ideally, the database software should enforce the ACID rules, summarized here:

* Atomicity: Either all the tasks in a transaction must be done, or none of them. The transaction must be completed, or else it must be undone (rolled back).
* Consistency: Every transaction must preserve the integrity constraints — the declared consistency rules — of the database. It cannot place the data in a contradictory state.
* Isolation: Two simultaneous transactions cannot interfere with one another. Intermediate results within a transaction are not visible to other transactions.
* Durability: Completed transactions cannot be aborted later or their results discarded. They must persist through (for instance) restarts of the DBMS after crashes.

A related concept is the cascading rollback: when a transaction (T1) causes a failure and must be rolled back, other transactions that depend on T1's actions must also be rolled back, causing a cascading effect.

In practice, many DBMSs allow most of these rules to be selectively relaxed for better performance.
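A minimal sketch of what these rules mean in practice, using a hypothetical accounts table: either both updates take effect or neither does, and once COMMIT returns the change survives a crash.

    -- Transfer 100 between two hypothetical accounts atomically
    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;
    COMMIT;
    -- If anything fails before COMMIT, ROLLBACK undoes both updates.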

Concurrency control is a method used to ensure that transactions are executed in a safe manner and follow the ACID rules. The DBMS must be able to ensure that only serializable, recoverable schedules are allowed, and that no actions of committed transactions are lost while undoing aborted transactions.

Replication

Replication of databases is closely related to transactions. If a database can log its individual actions, it is possible to create a duplicate of the data in real time. The duplicate can be used to improve performance or availability of the whole database system. Common replication concepts include:

* Master/Slave Replication: All write requests are performed on the master and then replicated to the slaves
* Quorum: The result of Read and Write requests are calculated by querying a "majority" of replicas.
* Multimaster: Two or more replicas sync each other via a transaction identifier.
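As one concrete (and heavily abbreviated) example, a MySQL-style master/slave setup is driven by statements along these lines on the slave; the host and credentials are placeholders, and the master must have binary logging and a unique server-id configured:

    -- Point the slave at the master and start replicating (placeholders throughout)
    CHANGE MASTER TO
      MASTER_HOST = 'master.example.com',
      MASTER_USER = 'repl',
      MASTER_PASSWORD = 'secret';
    START SLAVE;
    -- Real setups also set MASTER_LOG_FILE / MASTER_LOG_POS using the
    -- output of SHOW MASTER STATUS on the master.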

Security

Database security is the system, processes, and procedures that protect a database from unintended activity.

Applications of databases

Databases are used in many applications, spanning virtually the entire range of computer software. Databases are the preferred method of storage for large multiuser applications, where coordination between many users is needed. Even individual users find them convenient, and many electronic mail programs and personal organizers are based on standard database technology. Software database drivers are available for most database platforms so that application software can use a common application programming interface (API) to retrieve the information stored in a database. Two commonly used database APIs are JDBC and ODBC.

Monday, May 14, 2007

Database Models

Various techniques are used to model data structure. Most database systems are built around one particular data model, although it is increasingly common for products to offer support for more than one model. For any one logical model various physical implementations may be possible, and most products will offer the user some level of control in tuning the physical implementation, since the choices that are made have a significant effect on performance. An example of this is the relational model: all serious implementations of the relational model allow the creation of indexes which provide fast access to rows in a table if the values of certain columns are known.

Flat model
This may not strictly qualify as a data model, as defined above. The flat (or table) model consists of a single, two-dimensional array of data elements, where all members of a given column are assumed to be similar values, and all members of a row are assumed to be related to one another.

Hierarchical model
In a hierarchical model, data is organized into a tree-like structure, implying a single upward link in each record to describe the nesting, and a sort field to keep the records in a particular order in each same-level list.

Relational model
Three key terms are used extensively in relational database models: relations, attributes, and domains. A relation is a table with columns and rows. The named columns of the relation are called attributes, and the domain is the set of values the attributes are allowed to take.
The basic data structure of the relational model is the table, where information about a particular entity (say, an employee) is represented in columns and rows (also called tuples). Thus, the "relation" in "relational database" refers to the various tables in the database; a relation is a set of tuples. The columns enumerate the various attributes of the entity (the employee's name, address or phone number, for example), and a row is an actual instance of the entity (a specific employee) that is represented by the relation. As a result, each tuple of the employee table represents various attributes of a single employee.
All relations (and, thus, tables) in a relational database have to adhere to some basic rules to qualify as relations. First, the ordering of columns is immaterial in a table. Second, there can't be identical tuples or rows in a table. And third, each tuple will contain a single value for each of its attributes.
A relational database contains multiple tables, each similar to the one in the "flat" database model. One of the strengths of the relational model is that, in principle, any value occurring in two different records (belonging to the same table or to different tables) implies a relationship between those two records. Yet, in order to enforce explicit integrity constraints, relationships between records in tables can also be defined explicitly, by identifying or non-identifying parent-child relationships characterized by assigning cardinality (1:1, (0)1:M, M:M). Tables can also have a designated single attribute or a set of attributes that can act as a "key", which can be used to uniquely identify each tuple in the table.
A key that can be used to uniquely identify a row in a table is called a primary key. Keys are commonly used to join or combine data from two or more tables. For example, an Employee table may contain a column named Location which contains a value that matches the key of a Location table. Keys are also critical in the creation of indexes, which facilitate fast retrieval of data from large tables. Any column can be a key, or multiple columns can be grouped together into a compound key. It is not necessary to define all the keys in advance; a column can be used as a key even if it was not originally intended to be one.
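A sketch of the Employee/Location example in SQL (hypothetical table and column names): the location_id column in employee holds values that match the primary key of location, which is what lets the two tables be joined.

    -- Hypothetical tables illustrating a primary key / foreign key pair
    CREATE TABLE location (
      id   INTEGER PRIMARY KEY,
      city VARCHAR(100)
    );
    CREATE TABLE employee (
      id          INTEGER PRIMARY KEY,
      name        VARCHAR(100),
      location_id INTEGER REFERENCES location (id)
    );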

Relational operations
Users (or programs) request data from a relational database by sending it a query that is written in a special language, usually a dialect of SQL. Although SQL was originally intended for end-users, it is much more common for SQL queries to be embedded into software that provides an easier user interface. Many web sites, such as Wikipedia, perform SQL queries when generating pages.
In response to a query, the database returns a result set, which is just a list of rows containing the answers. The simplest query is just to return all the rows from a table, but more often, the rows are filtered in some way to return just the answer wanted. Often, data from multiple tables are combined into one, by doing a join. There are a number of relational operations in addition to join.
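Continuing with the hypothetical tables sketched above, a join query of this sort returns a single result set combining rows from both tables:

    -- Combine employee and location rows into one result set
    SELECT e.name, l.city
      FROM employee e
      JOIN location l ON e.location_id = l.id
     WHERE l.city = 'Boston';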

Normal Forms
Relations are classified based upon the types of anomalies to which they're vulnerable. A database that's in the first normal form is vulnerable to all types of anomalies, while a database that's in the domain/key normal form has no modification anomalies. Normal forms are hierarchical in nature. That is, the lowest level is the first normal form, and the database cannot meet the requirements for higher level normal forms without first having met all the requirements of the lesser normal forms.[1]
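As a tiny illustrative example of the idea (names are hypothetical): a table that crams several phone numbers into one column is not even in first normal form; moving the repeating values into their own table fixes that.

    -- Before: customer(id, name, phones) with phones = '555-1234, 555-9876'
    -- After: one atomic value per column, repeating group in its own table
    CREATE TABLE customer (
      id   INTEGER PRIMARY KEY,
      name VARCHAR(100)
    );
    CREATE TABLE customer_phone (
      customer_id INTEGER REFERENCES customer (id),
      phone       VARCHAR(20)
    );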

Object database models
In recent years, the object-oriented paradigm has been applied to database technology, creating a new programming model known as object databases. These databases attempt to bring the database world and the application programming world closer together, in particular by ensuring that the database uses the same type system as the application program. This aims to avoid the overhead (sometimes referred to as the impedance mismatch) of converting information between its representation in the database (for example as rows in tables) and its representation in the application program (typically as objects). At the same time, object databases attempt to introduce the key ideas of object programming, such as encapsulation and polymorphism, into the world of databases.
A variety of ways have been tried for storing objects in a database. Some products have approached the problem from the application programming end, by making the objects manipulated by the program persistent. This also typically requires the addition of some kind of query language, since conventional programming languages do not have the ability to find objects based on their information content. Others have attacked the problem from the database end, by defining an object-oriented data model for the database, and defining a database programming language that allows full programming capabilities as well as traditional query facilities.

Post-relational database models
Several products have been identified as post-relational because the data model incorporates relations but is not constrained by the Information Principle, which requires that all information be represented by data values in relations. Products using a post-relational data model typically employ a model that actually pre-dates the relational model; such a model might be described as a directed graph with trees on the nodes.
Examples of models that could be classified as post-relational are PICK aka MultiValue, and MUMPS, aka M.