Showing posts with label RAC. Show all posts
Showing posts with label RAC. Show all posts

Tuesday, May 24, 2016

Improving the performance of ORDER_LASTMOD_IDX on DCSPP_ORDER in ATG

I just stumbled on Metalink article 1603359.1, which recommends that ATG customers encountering contention on the ORDER_LASTMOD_IDX index on table DCSPP_ORDER drop the index. It's noted that this is especially likely in RAC and Exadata.

The indexed column, LAST_MODIFIED_DATE (a TIMESTAMP), frequently changes for an order and every order being changed will be adding or updating nearly the same value. This means that every session will be attempting to add data in the same index leaf, meaning there will be heavy competition for that block in RAC as each node attempts to capture it.

The index will also grow significantly faster than its data. Say an order goes through a series of ten updates; each will require that the "old" leaf be cleared of its value and a new one added further "up" the index (we're talking about a timestamp after all). An order updated ten times will leave nine empty leaves behind, which will never be reclaimed since a record will never be updated to a past time.

This can certainly be dealt with by rebuilding the index periodically, but it doesn't solve the contention issue. For shops using Standard Edition, rebuilding an index comes at the cost of locking the table, effectively causing an outage during the maintenance. Dropping the index comes at the expense of poor performance of some ATG queries that search on LAST_MODIFIED_DATE.

But there's another solution—rebuilding the index with the REVERSE keyword.

A reverse key on the index will spread activity across ten "hot" blocks instead of just one. That's because, instead of entering a timestamp based on the incremental time with the least volatile components (Year, month, day) occupying the most volatile spots (the leading edge of the index) a reverse index on this column will put the most volatile portion first (the seconds/milliseconds) and the least volatile last.

The index also won't fragment, either, since the "empty" spots in the index can be reused. Under the default configuration we would see this index grow to 11G or more in a matter of months, when the table itself was under 3G. Since converting the index to be a reverse key, it's remained at a much more stable (and manageable) 1G despite the table growing to over 5G.

Waits, in particular Global Cache waits associated with RAC attempting to obtain a cached block from another node, virtually disappeared for us.

The only downside is that any BETWEEN condition against the LAST_MODIFIED_DATE will not use the index, since a reverse key index can't be used for a range scan.

Friday, March 11, 2016

A CRS Action Script for Solarwinds Database Performance Analyzer (DPA)

I've been using Solarwinds DPA for a few years now and find it to be a great tool for monitoring Oracle databases. It's one of the few GUI tools I actually enjoy working with. The interface is clean and provides great information without being cluttered and the navigation is intuitive.

We first installed DPA on a trial basis. It went on a box that, well, let's just say it was a place I could install some trial software without having to put a lot of effort into things. It worked out and we licensed it. In place. Over time, we've become more reliant on it for monitoring our Oracle databases. But it was still on a junker, single node database server.

A junker, single node database server that was about to go out of support.

We decided to move it to a RAC cluster. We'd get gaps in our monitoring when we applied database patches, or when the node had to be rebooted, or when the node just took a dump and RAC seemed like a nice solution.

DPA is really just a tomcat application that uses an Oracle database as a repository. With Grid Infrastructure it's possible to manage a resource like tomcat within CRS. This would mean DPA could be dependent on its database resource, start automatically on a server reboot, and on node failure it would relocate to another node.

For those that don't know, CRS manages resource through an action script. This is just a shell script that CRS runs that has options for starting, stopping, checking and cleaning a resource.

I figured Solarwinds would have a canned action script. Turns out they didn't, so I rolled my own. Here it is for anyone that's interested.

First is the action script itself:

 #!/bin/sh  
   
 # Startup, shutdown, clean and check script for SolarWinds DPA (Database Performance Analyzer) processes.  
   
 . /oracle/dpa/setenv.sh  
 logfile=$DPA_LOGS/dpactl.log  
   
 usage()  
 {  
  echo "Usage: $0 start|stop|check|clean"  
  exit 1  
 }  
   
 check_dpa()  
 {  
  check=$(ps -ef | grep $DPA_HOME | grep tomcat | grep -v grep | wc -l)   
  if [ $check -eq 1 ]  
   then exit 0  
   else exit 1  
  fi  
 }  
   
 getnow()  
 {  
  now=$(date '+%m/%d/%y %H:%M:%S')  
 }  
   
 case "$1" in  
    'start') getnow  
        echo "Starting Solarwinds DPA at ${now}." | tee -a $logfile  
        echo "This will take a few moments. Please be patient..." | tee -a $logfile  
        $DPA_HOME/startup.sh | tee -a $logfile  
   
        check_dpa  
        if [ $check -eq 1 ]  
         then getnow  
            echo "Solarwinds DPA started successfully at ${now}." | tee -a $logfile  
         else echo "There was a problem starting Solarwinds DPA!" | tee -a $logfile  
        fi  
    ;;  
   
    'stop') getnow  
        echo "Stopping Solarwinds DPA at ${now}." | tee -a $logfile  
        echo "This will take a few moments. Please be patient..." | tee -a $logfile  
        $DPA_HOME/shutdown.sh | tee -a $logfile  
        EXITCODE=$?  
        check_dpa  
   
        if [ $EXITCODE == 1 ] || [ $check -gt 0 ]  
         then error "There was a problem stopping Solarwinds DPA!" | tee -a $logfile  
        fi  
   
        getnow  
        echo "Solarwinds DPA stopped successfully at ${now}." | tee -a $logfile  
        exit 0  
    ;;  
   
    'check') check_dpa  
    ;;  
   
    'clean') dpa=$(ps -ef | grep $DPA_HOME | grep tomcat | grep -v grep | awk '{print $2}')  
        if [ ! -z $dpa ]   
         then kill -9 -f $dpa  
            exit 0  
         else exit 1  
        fi  
    ;;  
   
    *) usage  
    ;;  
   
 esac  

Next is a script that sets the environment, called at the beginning of the action script. It avoids having to hard-code directory paths into the action script itself.

 export DPA_HOME=/oracle/dpa/dpa_10_1_313  
 export DPA_LOGS=/oracle/dpa/logs  

Finally, there are the commands to create the Grid/CRS resource, as well as starting, stopping and relocating it.

 $CRS_HOME/bin/crsctl add resource solarwinds_DPA -type cluster_resource -attr "ACTION_SCRIPT=/path/to/script/solarwinds_dpa.sh, DESCRIPTION='SolarWinds Database Performance Analyzer', DEGREE=1, ENABLED=1, AUTO_START=always, START_TIMEOUT=0, CHECK_INTERVAL=60, STOP_TIMEOUT=0, SCRIPT_TIMEOUT=60, RESTART_ATTEMPTS=2, OFFLINE_CHECK_INTERVAL=0, START_DEPENDENCIES='hard(ora.repo.db)', ACL='owner:oracle:rwx,pgrp:oinstall:rwx,other::r--', CARDINALITY=1, FAILURE_INTERVAL=0, FAILURE_THRESHOLD=0"  
 $CRS_HOME/bin/crsctl start resource solarwinds_DPA  
 $CRS_HOME/bin/crsctl relocate resource solarwinds_DPA -n node2  

You'll need to install DPA on all nodes of the cluster, or in a shared directory. You'll also need a directory for the startup/shutdown logs.