RAC - Performance

I have already discussed basic Oracle tuning, in this section I will mainly dicuss Oracle RAC tuning. First lets review the best pratices of a Oracle design regarding the application and database

Optimize connection management, ensure that the middle tier and programs that connect to the database are efficent in connection management and do not log on or off repeatedly
Tune the SQL using the available tools such as ADDM and SQL Tuning Advisor
Ensure that applications use bind variables, cursor_sharing was introduced to solve this problem
Use packages and procedures (because they are compiled) in place of anonymous PL/SQL blocks and big SQL statements
Use locally managed tablespaces and automatic segment space management to help performance and simplify database administration
Use automatic undo management and temporary tablespace to simplify administration and increase performance
Ensure you use large caching when using sequences, unless you cannot afford to lose sequence during a crash
Avoid using DDL in production, it increases invalidations of the already parsed SQL statements and they need to be recompiled
Partion tables and indexes to reduce index leaf contention (buffer busy global cr problems)
Optimize contention on data blocks (hot spots) by avoiding small tables with too many rows in a block

Now we can review RAC specific best practices

Consider using application partitioning (see below)
Consider restricting DML-intensive users to using one instance, thus reducing cache contention
Keep read-only tablespaces away from DML-intensive tablespaces, they only require minimum resources thus optimizing Cache Fusion performance
Avoid auditing in RAC, this causes more shared library cache locks
Use full tables scans sparingly, it causes the GCS to service lots of block requests, see table v$sysstat column "table scans (long tables)"
if the application uses lots of logins, increase the value of sys.audsess$ sequence

Partitioning Workload

Workload partitioning is a certian type of workload that is executed on an instance, that is partitioning allows users who access the same set of data to log on to the same instance. This limits the amount of data that is shared between instances thus saving resources used for messaging and Cache Fusion data block transfer.

You should consider the following when deciding to implement partitioning

If the CPU and private interconnects are of high performance then there is no need to to partition
Partitioning does add complexity, thus if you can increase CPU and the interconnect performance the better
Only partition if performance is betting impacted
Test both partitioning and non-partitioning to what difference it makes, then decide if partitioning is worth it

RAC Wait Events

An event is an operation or particular function that the Oracle kernel performs on behalf of a user or a Oracle background process, events have specific names like database event. Whenever a session has to wait for something, the wait time is tracked and charged to the event that was associated with that wait. Events that are associated with all such waits are known as wait events. The are a number of wait classes

Commit
Scheduler
Application
Configuration
User I/O
System I/O
Concurrency
Network
Administrative
Cluster
Idle
Other

There are over 800 different events spread across the above list, however you probably will only deal with about 50 or so that can improve performance.

When a session requests access to a data block it sends a request to the lock master for proper authorization, the request does not know if it will receive the block via Cache Fusion or a permission to read from the disk. Two placeholder events

global cache cr request (consistent read - cr)
global cache curr request (current - curr)

keep track of the time a session spends in this state. There are number of types of wait events regarding access to a data block

Wait Event	Contention type	Description
gc current block 2-way	write/write	an instance requests authorization for a block to be accessed in current mode to modify a block, the instance mastering the resource receives the request. The master has the current version of the block and sends the current copy of the block to the requestor via Cache Fusion and keeps a Past Image (.PI) If you get this then do the following Analyze the contention, segments in the "current blocks received" section of AWR Use application partitioning scheme Make sure the system has enough CPU power Make sure the interconnect is as fast as possible Ensure that socket send and receive buffers are configured correctly
gc current block 3-way	write/write	an instance requests authorization for a block to be accessed in current mode to modify a block, the instance mastering the resource receives the request and forwards it to the current holder of the block, asking it to relinquish ownership. The holding instance sends a copy of the current version of the block to the requestor via Cache Fusion and transfers the exclusive lock to the requesting instance. It also keeps a past Image (PI). Use the above actions to increase the performance
gc current block 2-way	write/read	The difference with the one above is that this sends a copy of the block thus keeping the current copy.
gc current block 3-way	write/read	The difference with the one above is that this sends a copy of the block thus keeping the current copy.
gc current block busy	write/write	The requestor will eventually get the block via cache fusion but it is delayed due to one of the following The block was being used by another session on another session was delayed as the holding instance could not write the corresponding redo record immediately If you get this then do the following Ensure the log writer is tuned
gc current buffer busy	local	This is the same as above (gc current block busy), the difference is that another session on the same instance also has requested the block (hence local contention)
gc current block congested	none	This is caused if heavy congestion on the GCS, thus CPU resources are stretched

Enqueue Tuning

Oracle RAC uses a queuing mechanism to ensure proper use of shared resources, it is called Global Enqueue Services (GES). Enqueue wait is the time spent by a session waiting for a shared resource, here are some examples of enqueues:

updating the control file (CF enqueue)
updating an individual row (TX enqueue)
exclusive lock on a table (TM enqueue)

Enqueues can be managed by the instance itself others are used globally, GES is responsible for coordinating the global resources. The formula used to calculate the number of enqueue resources is as below

GES Resources = DB_FILES + DML_LOCKS + ENQUEUE_RESOURCES + PROCESS + TRANSACTION x (1 + (N - 1)/N)

N = number of RAC instances

displaying enqueues stats

SQL> column current_utilization heading current
SQL> column max_utilization heading max_usage
SQL> column initial_allocation heading initial
SQL> column resource_limit format a23;

SQL> select * from v$resource_limit;

AWR and RAC

I have already discussed AWR in a single instance environment, so for a quick refresh take a look and come back here to see how you can use it in a RAC environment.

From a RAC point of view there are a number of RAC-specific sections that you need to look at in the AWR, in the report section is a AWR of my home RAC environment, you can view the whole report here.

RAC AWR Section	Report	Description
Number of Instances	instances	lists the number of instances from the beginning and end of the AWR report
Instance global cache load profile	global cache	information about the interinstance cache fusion data block and messaging traffic, because my AWR report is lightweight here is a more heavy used RAC example Global Cache Load Profile ~~~~~~~~~~~~~~~~~~~~~~~~~ Per Second Per Transaction --------------- --------------- Global Cache blocks received: 315.37 12.82 Global Cache blocks served: 240.30 9.67 GCS/GES messages received: 525.16 20.81 GCS/GES messages sent: 765.32 30.91 The first two statistics indicate the number of blocks transferred to or from this instance, thus if you are using a 8K block size Sent: 240 x 8,192 = 1966080 bytes/sec = 2.0 MB/sec Received: 315 x 8,192 = 2580480 bytes/sec = 2.6 MB/sec to determine the amount of network traffic generated due to messaging you first need to find the average message size (this was 193 on my system) select sum(kjxmsize * (kjxmrcv + kjxmsnt + kjxmqsnt)) / sum((kjxmrcv + kjxmsnt + kjxmqsnt)) "avg Message size" from x$kjxm where kjxmrcv > 0 or kjxmsnt > 0 or kjxmqsnt > 0; then calculate the amount of messaging traffic on this network 193 (765 + 525) = 387000 = 0.4 MB to calculate the total network traffic generated by cache fusion = 2.0 + 2.6 + 0.4 = 5 MBytes/sec = 5 x 8 = 40 Mbits/sec The DBWR Fusion writes statistic indicates the number of times the local DBWR was forced to write a block to disk due to remote instances, this number should be low.
Glocal cache efficiency percentage	global cache efficiency	this section shows how the instance is getting all the data blocks it needs. The best order is the following Local cache Remote cache Disk The first two give the cache hit ratio for the instance, you are looking for a value less than 10%, if you are getting higher values then you may consider application partitioning.
GCS and GES - workload characteristics	GCS and GES workload	this section contains timing statistics for global enqueue and global cache. As a general rule you are looking for All timings related to CR (Consistent Read) processing block should be less than 10 msec All timings related to CURRENT block processing should be less than 20 msec
Messaging statistics	messaging	The first section relates to sending a message and should be less than 1 second. The second section details the breakup of direct and indirect messages, direct messages are sent by a instance foreground or the user processes to remote instances, indirect are messages that are not urgent and are pooled and sent.
Service statistics	Service stats	shows the resources used by all the service instance supports
Service wait class statistics	Service wait class	summarizes waits in different categories for each service
Top 5 CR and current block segements	Top 5 CR and current blocks	conatns the names of the top 5 contentious segments (table or index). If a table or index has a very high percentage of CR and Current block transfers you need to investigate. This is pretty much like a normal single instance.

Cluster Interconnect

As I stated above the interconnect it a critical part of the RAC, you must make sure that this is on the best hardware you can buy. You can confirm that the interconnect is being used in Oracle 9i and 10g by using the command oradebug to dump information out to a trace file, in Oracle 10g R2 the cluster interconnect is also contained in the alert.log file, you can view my information from here.

interconnect

SQL> oradebug setmypid
SQL> oradebug ipc

Note: look in the user_dump_dest directory, the trace will be there