Peasland Database Blog

Lighty Purchasing Adventure


Those who follow me on Twitter and on this blog know that I really like Orachrome's Lighty for performance tuning. I prefer it over Enterprise Manager. I've gotten to the point where I get depressed if Lighty isn't available to me and I am forced to use EM in its place. Don't get me wrong, EM is a fine product, I still recommend it, and it does much more than performance tuning. But for watching and troubleshooting database performance problems, Lighty is one of the top tools on my list!

It was quite an adventure getting the product licensed at my company, so I figured I'd describe it. While the adventure was ongoing, I had to go without Lighty after my trial expired, which was a bit painful at times. Thankfully this is all over, but here's a recap of how the trip unfolded.

The first twist to purchasing Lighty was that the quote came to us in Euros. SETRA Conseil, who bought Orachrome, is a French company, so it's natural they would quote in Euros. Their web site lists prices in both Euros and US Dollars. I needed the quote in the latter, and it took an extra day to obtain it. We'll dock Lighty one point since they couldn't read my mind and know exactly what I wanted. Lighty -1.

Next, we received an invoice, and Lighty wanted us to perform a wire transfer, after which they would send us the license keys. Apparently, my company cannot perform wire transfers here in the US where I'm based, so the invoice was sent to our parent company, which sent it to an accounting department on the other side of the planet. I'm not sure what happened, but our accounting department was unable to successfully perform a wire transfer. My Company -1.

The fine folks at Lighty did not have a direct method to accept a credit card payment. Lighty -1 (now down two points).

However, they were more than willing to work with us, so they gave us a way to pay via credit card through PayPal. Lighty +1 (now down only one point; we're both tied at -1).

My company had problems paying via credit card on PayPal. My Company -1 (now down two points).

The fine folks at Lighty offered us an extra license for our troubles. Lighty +1 (they are now holding steady at zero, neither up nor down).

My company was finally able to sort out the issues with PayPal. Within an hour of my purchasing agent letting me know the transaction was complete, Lighty had shipped me the license. Lighty +1 (they are now on the positive side).

So it was a bit of an adventure. I'm not in accounting, so I'm not sure why there was so much hassle. Life could have been made easier if Lighty accepted credit card payments directly, but that ended up not being workable.

So in the final score, I have Lighty +1 and My Company -2. :)

But if you factor in the great product I am now fully licensed for, I think it's Lighty +999, My Company -2.

 


Dynamic Sampling Killing Me in 12c


Eleven days ago, I blogged about how Adaptive Dynamic Stats was consuming resources in my production RAC databases.

After putting out that fire, I moved on to examine some poorly performing queries reported by our QA people in Test and other non-production databases. I did what any good Oracle DBA would do: I obtained a stored procedure call that duplicated the problem. In my session, I started a SQL trace and ran the stored procedure. It took 50 seconds to complete, when it used to take 5 seconds or less before I upgraded from 11.2.0.4 to 12.1.0.2. This stored procedure contains a number of SQL statements, and a SQL trace seemed like a logical place to start; I needed to know which SQL statement in the procedure was causing the problems.
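
For reference, here is a minimal sketch of one way to capture such a trace in SQL*Plus; the procedure name my_proc is a placeholder, not the actual procedure from this system.

-- Tag the trace file so it is easy to find in the trace directory
ALTER SESSION SET tracefile_identifier = 'slow_proc';
-- Start tracing, run the (placeholder) procedure, then stop tracing
EXEC DBMS_SESSION.set_sql_trace(TRUE)
EXEC my_proc
EXEC DBMS_SESSION.set_sql_trace(FALSE)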

I ran the SQL trace file through TKPROF and was surprised by the results. The SQL statements in the stored procedure seemed to be executing pretty quickly. But I was greeted by many statements similar to the following:

SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring
 optimizer_features_enable(default) no_parallel */ SUM(C1)
FROM
 (SELECT /*+ qb_name("innerQuery") INDEX_FFS( "XXX"
 "INDEX_NAME") */ 1 AS C1 FROM
 "OWNER"."TABLE_NAME" SAMPLE BLOCK(71.048, 8) SEED(1)
 "XXX") innerQuery

This is Dynamic Sampling at work. In looking at all of the Dynamic Sampling statements being executed in my trace file, I was able to determine that these accounted for 45 seconds of the overall runtime! Yikes!

Dynamic Sampling is supposed to help me out: the time spent obtaining sample statistics is supposed to be much smaller than the time saved by executing the SQL statement with better stats. When it isn't, your SQL statement performance can suffer, as in my case.

One thing I found interesting was that these Dynamic Sampling queries were executed once for each table and once for each of its indexes. One of the tables involved in my query has 7 indexes, so for that one table alone I had 8 Dynamic Sampling queries!

In my blog post 11 days ago, I had set the optimizer_dynamic_sampling parameter to 0, which stops these queries from being executed. I had not yet put that change into our Test environment, so I needed to do so. As soon as I did, query performance returned to normal. The default value of this parameter for my database is 2. Your default value can differ depending on the value of the optimizer_features_enable setting. According to this blog post, a value of 2 means that dynamic sampling will kick in when at least one of the tables has no statistics. But to be honest, dynamic sampling isn't giving me any benefits and is only causing me harm, so I'll just leave it off entirely for now.
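
For completeness, here is a minimal sketch of that change, assuming the instance-wide setting is what you want; as always, test the impact in a non-production environment first.

-- Disable dynamic sampling for the whole instance
ALTER SYSTEM SET optimizer_dynamic_sampling = 0 SCOPE=BOTH;

-- Or experiment in a single session before committing to the change
ALTER SESSION SET optimizer_dynamic_sampling = 0;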

 

 

cd: -M: invalid option


I'm trying to clean up trace files on one of my RAC testbeds. Oracle Corp was gracious enough to name the database "-MGMTDB" for me, giving me a nice challenge (dripping with sarcasm). Here I am in my DIAGNOSTIC_DEST, where we can see two databases.

[oracle@host01 trace]$ cd /u01/app/oracle/diag/rdbms
[oracle@host01 rdbms]$ ls -l
total 8
drwxr-x--- 3 oracle oinstall 4096 Jun 17 14:07 _mgmtdb
drwxr-x--- 3 oracle oinstall 4096 Aug 10 13:13 resp

The directory ‘resp’ is for my Research Primary database, a testbed. The first entry is for the Cluster Health Monitor (CHM) repository database on my Grid Infrastructure 12.1.0.2 system. I can change directory easily enough.

[oracle@host01 rdbms]$ cd _mgmtdb
[oracle@host01 _mgmtdb]$ ls -l
total 4
-rw-r----- 1 oracle oinstall 0 Jun 17 14:07 i_1.mif
drwxr-x--- 16 oracle oinstall 4096 Jun 17 14:06 -MGMTDB

But now I have trouble with the next ‘cd’ command.

[oracle@host01 _mgmtdb]$ cd -MGMTDB
-bash: cd: -M: invalid option
cd: usage: cd [-L|-P] [dir]

To get around that, I need to use “dot-slash” before the directory name.

[oracle@host01 _mgmtdb]$ cd ./-MGMTDB
[oracle@host01 -MGMTDB]$ cd trace

Now like any other Oracle trace directory, I have lots of .trc and .trm files, similar to these:

-rw-r----- 1 oracle oinstall 21301 Nov 30 13:43 -MGMTDB_vktm_5472.trc
-rw-r----- 1 oracle oinstall 1946 Nov 30 13:43 -MGMTDB_vktm_5472.trm

So how do I remove them? I get an error because 'rm' thinks that "-M" is a parameter.

[oracle@host01 trace]$ rm *.trc *.trm
rm: invalid option -- M
Try `rm ./-MGMTDB_ckpt_5494.trc' to remove the file `-MGMTDB_ckpt_5494.trc'.
Try `rm --help' for more information.

The trick is to use "--" to tell the command line that what follows is no longer a list of parameters.

[oracle@host01 trace]$ rm -- *.trc *.trm

Life would have been so much easier if Oracle had remembered that almost everyone runs Oracle on *nix, where command options also start with a dash, before picking a database name with a leading dash.

SQLT in 12c Can’t Gather Stats


After upgrading to 12c, I did have some issues where processing in our database was running into the following errors:

ORA-20000: Unable to gather statistics concurrently: insufficient privileges

The fix was pretty easy. I found information on Tim’s Oracle Base website on the workaround here: https://oracle-base.com/articles/12c/concurrent-statistics-collection-12cr1

Today I tried to run SQLT's sqlxtract script to help tune a problem SQL statement. I was surprised when it failed early on. I checked the log and found that SQLT was running into this same issue. The workaround was the same: I just granted the following:

CREATE JOB
MANAGE SCHEDULER
MANAGE ANY QUEUE

I granted these system privileges to both SQLTEXPLAIN and SQLTXADMIN.
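
As a sketch, the grant statements might look like this when run as a suitably privileged user; SQLTEXPLAIN and SQLTXADMIN are the standard SQLT schemas.

GRANT CREATE JOB, MANAGE SCHEDULER, MANAGE ANY QUEUE TO sqltexplain;
GRANT CREATE JOB, MANAGE SCHEDULER, MANAGE ANY QUEUE TO sqltxadmin;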

Fast Split Partitioning


I have a partitioned table for some application logging. A few years ago, I partitioned the table with one partition per month. As we near 2016, it's time for me to add partitions for the new year. The partitioned table has, as its last two partitions, the partition for December 2015 and a partition using MAXVALUE. I never plan on having any data in the MAXVALUE partition; it is just there to make SPLIT PARTITION operations easier.

In the past, I would add partitions with commands similar to the following:

ALTER TABLE usage_tracking
SPLIT PARTITION usage_tracking_pmax AT (TO_DATE('02/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS'))
INTO (PARTITION usage_tracking_p201601, PARTITION usage_tracking_pmax);
ALTER TABLE usage_tracking
SPLIT PARTITION usage_tracking_pmax AT (TO_DATE('03/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS'))
INTO (PARTITION usage_tracking_p201602, PARTITION usage_tracking_pmax);

 

The SQL statements above will split the MAXVALUE partition into two partitions. There are 12 such commands, one for each month.

This year, when I tried to run the script for 2016 in a non-production environment, I was surprised to find that each of these commands took about 30 minutes to complete. In previous years, they completed in seconds. Remember that USAGE_TRACKING_PMAX is empty, so no data needs to be moved into an appropriate partition.

In analyzing the activity of my session performing the SPLIT, I could clearly see db file wait events which were tracked to this partitioned table. It was obvious that the SPLIT operation was reading the max partition, even though it was empty.

Previous years worked fine, but this database was recently upgraded to Oracle 12c. I found information on how to perform a fast split partition operation in MOS Note 1268714.1, which says it applies to Oracle 10.2.0.3 and higher, yet I did not have any issues on 11.2.0.4. It was probably just dumb luck, and I don't have an 11g database to check this on since all of mine have been upgraded. As such, rather than focusing on what changed, I'll just address the problem and get on with my day.

Per the MOS note, to perform a fast split partition on this empty partition, I need to make sure that I have stats on the empty partition.

I confirmed that NUM_ROWS was 0 for this empty partition, so I didn't have to calculate stats on it first. My first SPLIT PARTITION operation was very fast, just a few seconds. The partition was empty and Oracle knew it. What surprised me was that the resulting partitions, USAGE_TRACKING_P201601 and USAGE_TRACKING_PMAX, were left with NULL values for their statistics. This meant that performing the SPLIT PARTITION operation for the second new partition would take a long time. Here is an example of what I mean. First, we can see 0 rows in the max value partition.

SQL> select num_rows from dba_tab_partitions
  2  where partition_name='USAGE_TRACKING_PMAX';

  NUM_ROWS
----------
         0

Now I’ll split that partition.

 

SQL> ALTER TABLE usage_tracking
  2  SPLIT PARTITION usage_tracking_pmax AT ( TO_DATE('02/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS') )
  3  INTO (PARTITION usage_tracking_p201601, PARTITION usage_tracking_pmax);

Table altered.
Elapsed: 00:00:03.13

 

Notice that the last two partitions now have no stats.

 

SQL> select num_rows from dba_tab_partitions
  2  where partition_name='USAGE_TRACKING_PMAX';

  NUM_ROWS
----------


SQL> select num_rows from dba_tab_partitions
  2  where partition_name='USAGE_TRACKING_P201601';

  NUM_ROWS
----------


With no stats, the next split partition to create the February 2016 partition takes a long time.

 

SQL> ALTER TABLE nau_system.usage_tracking
  2  SPLIT PARTITION usage_tracking_pmax AT (TO_DATE('03/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS'))
  3  INTO (PARTITION usage_tracking_p201602, PARTITION usage_tracking_pmax);

Table altered.
Elapsed: 00:27:41.09

 

As the MOS note says, we need the stats on the partition to perform a fast split operation. The solution is to calculate stats on the partition, and then use one ALTER TABLE command to create all the partitions at once.

BEGIN
  DBMS_STATS.gather_table_stats(
    ownname     => USER,  -- assumes you are connected as the table owner
    tabname     => 'USAGE_TRACKING',
    partname    => 'USAGE_TRACKING_PMAX',
    granularity => 'PARTITION');
END;
/
ALTER TABLE usage_tracking
SPLIT PARTITION usage_tracking_pmax INTO
 (PARTITION usage_tracking_p201601 VALUES LESS THAN (TO_DATE('02/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS')),
  PARTITION usage_tracking_p201602 VALUES LESS THAN (TO_DATE('03/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS')),
  PARTITION usage_tracking_p201603 VALUES LESS THAN (TO_DATE('04/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS')),
  PARTITION usage_tracking_p201604 VALUES LESS THAN (TO_DATE('05/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS')),
  PARTITION usage_tracking_p201605 VALUES LESS THAN (TO_DATE('06/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS')),
  PARTITION usage_tracking_p201606 VALUES LESS THAN (TO_DATE('07/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS')),
  PARTITION usage_tracking_p201607 VALUES LESS THAN (TO_DATE('08/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS')),
  PARTITION usage_tracking_p201608 VALUES LESS THAN (TO_DATE('09/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS')),
  PARTITION usage_tracking_p201609 VALUES LESS THAN (TO_DATE('10/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS') ),
  PARTITION usage_tracking_p201610 VALUES LESS THAN (TO_DATE('11/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS') ),
  PARTITION usage_tracking_p201611 VALUES LESS THAN (TO_DATE('12/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS') ),
  PARTITION usage_tracking_p201612 VALUES LESS THAN (TO_DATE('01/01/2017 00:00:00','MM/DD/YYYY HH24:MI:SS') ),
  PARTITION usage_tracking_pmax);

 

If I had left the script performing 12 individual SPLIT PARTITION operations, I would have needed to recalculate stats on the max partition between each one, as sketched below. Using one command was more efficient.
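
For illustration only, here is a hedged sketch of that less efficient alternative, showing the first month; the gather-then-split pair would have to be repeated for every remaining month.

BEGIN
  DBMS_STATS.gather_table_stats(
    ownname     => USER,  -- assumes the script runs as the table owner
    tabname     => 'USAGE_TRACKING',
    partname    => 'USAGE_TRACKING_PMAX',
    granularity => 'PARTITION');
END;
/
ALTER TABLE usage_tracking
SPLIT PARTITION usage_tracking_pmax AT (TO_DATE('02/01/2016 00:00:00','MM/DD/YYYY HH24:MI:SS'))
INTO (PARTITION usage_tracking_p201601, PARTITION usage_tracking_pmax);
-- ...regather stats on the again-empty max partition, then split for March, and so on...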

LongOpsWatcher in SQL Dev


I saw a video of someone who used the new command line utility, sqlcl to create a bar graph showing the progress of long operations in Oracle, as seen from V$SESSION_LONGOPS. That video inspired me to do something similar in SQL Developer.

Below is a video of LongOpsWatcher in action. You can see the time remaining. It calculates the completion percentage and includes a bar chart. I selected a 5 second refresh rate.

 

There is no way to have SQL Developer automatically launch this report with a non-zero refresh rate. Maybe that will come in a future version. I filled out an enhancement request and I've been told others have offered a similar suggestion.

Here is the SQL statement used in this SQL Developer report:

select inst_id,sid,message,time_remaining,to_char((sofar/totalwork)*100,'990.00') as pct_complete,
'SQLDEV:GAUGE:0:100:0:100:'||nvl(trunc((sofar/totalwork)*100,2),0) as pct_bar
from gv$session_longops
where time_remaining>0

 

Feel free to modify it to suit your needs.
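
As one possible modification, here is a hedged variation that also shows the operation name and elapsed seconds; every column referenced below exists in GV$SESSION_LONGOPS, but verify the gauge still renders as expected in your SQL Developer version.

select inst_id, sid, opname, message, elapsed_seconds, time_remaining,
to_char((sofar/totalwork)*100,'990.00') as pct_complete,
'SQLDEV:GAUGE:0:100:0:100:'||nvl(trunc((sofar/totalwork)*100,2),0) as pct_bar
from gv$session_longops
where totalwork > 0
and sofar < totalwork
order by time_remaining desc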

A Trip Through the GIMR


Oracle Grid Infrastructure includes the Cluster Health Monitor (CHM), which regularly captures OS-related performance information. In early versions, CHM used a Berkeley DB for its data store. As of Grid Infrastructure 12.1.0.2, it is required to use an Oracle database for the data store. This Oracle database is called the Grid Infrastructure Management Repository (GIMR). Many people are already aware that the GIMR runs with the database name "-MGMTDB" and runs on only one node of the GI cluster. Should that node become unavailable, GI will automatically start the GIMR on a remaining node.

The above paragraph is about all of the background information I am going to provide on the GIMR. Readers who want to know more can certainly do a web search for information on how to manage this database (what little management it needs) and how to start and stop the database and its dedicated listener.

This blog post intends to educate the reader on how to access the GIMR database and extract meaningful information from it. More web searching can show how to use command line utilities to export data from the GIMR, and there is a graphical utility, CHMOSG, that can be used to view the CHM data in the repository. But just for fun, I thought I would show how to get to the data directly.

First, you need to know which node the database is running on. On any node, I can issue the following:

[oracle@host01 bin]$ cd /u01/app/crs12.1.0.2/bin
[oracle@host01 bin]$ ./crs_stat -t | grep -i mgmt
ora.MGMTLSNR ora....nr.type ONLINE ONLINE host01 
ora.mgmtdb ora....db.type ONLINE ONLINE host01

The above shows the database and the listener are running on host01. Now that I know the instance’s node, I can sign on to that node, and set my environment variables to connect to the instance. This database runs out of the Grid Infrastructure home, not the RDBMS home. So I need to set my ORACLE_HOME correctly. Also, the instance name starts with a dash so I need to wrap the SID in double quotes.

[oracle@host01 ~]$ export ORACLE_HOME=/u01/app/crs12.1.0.2
[oracle@host01 ~]$ export PATH=$ORACLE_HOME/bin:$PATH
[oracle@host01 ~]$ export ORACLE_SID="-MGMTDB"

I can now connect to the instance and verify I am connected to the proper one.

[oracle@host01 ~]$ sqlplus /nolog
SQL*Plus: Release 12.1.0.2.0 Production on Mon Dec 21 15:17:21 2015
Copyright (c) 1982, 2014, Oracle. All rights reserved.
SQL> connect / as sysdba
Connected.
SQL> select instance_name from v$instance;
INSTANCE_NAME
----------------
-MGMTDB

This database is an Oracle multitenant database with one PDB. I need to determine the PDB name, which will be the same as the cluster name. I can remind myself of the cluster name by querying V$ACTIVE_SERVICES.

SQL> select name,con_id
 2 from v$active_services;
NAME                                                      CON_ID
----------------------------------------------------- ----------
my_cluster                                                     3
-MGMTDBXDB                                                     1
_mgmtdb                                                        1
SYS$BACKGROUND                                                 1
SYS$USERS                                                      1
SQL> alter session set container=my_cluster;
Session altered.

Only one service has a container ID not equal to 1 (container 1 is the CDB root), so it must be the PDB I am looking for. As shown above, I modified my session to use that PDB as its container.

My next task is to get a list of tables owned by CHM.

SQL> select table_name from dba_tables where owner='CHM'
 2 order by table_name;
TABLE_NAME
--------------------------------------------------------------------------------
CHMOS_ACTIVE_CONFIG_INT_TBL
CHMOS_ASM_CONFIG_INT_TBL
CHMOS_CPU_INT_TBL
CHMOS_DEVICE_INT_TBL
CHMOS_FILESYSTEM_INT_TBL
CHMOS_NIC_INT_TBL
CHMOS_PROCESS_INT_TBL
CHMOS_STATIC_CONFIG_INT_TBL
CHMOS_SYSTEM_PERIODIC_INT_TBL
CHMOS_SYSTEM_SAMPLE_INT_TBL

Just 10 tables in the schema. The first table in the list shows some configuration information about the CHM monitored hosts.

SQL> select hostname,NUMPHYCPUS,NUMCPUS,NUMDISKS
 2 from CHM.CHMOS_ACTIVE_CONFIG_INT_TBL;
HOSTNAME   NUMPHYCPUS NUMCPUS    NUMDISKS
---------- ---------- ---------- ----------
host01              1          2          3
host02              1          2          3

I can see that CHM is gathering information about two nodes in the cluster. I can see the number of physical CPUs for each node and the total number of cores (2). These nodes also have 3 disks.

We can also learn information about the OS.

SQL> select hostname,osname,chiptype
 2 from CHM.CHMOS_STATIC_CONFIG_INT_TBL;
HOSTNAME OSNAME CHIPTYPE
---------- ---------- --------------------
host01 Linux Intel(R)
host02 Linux Intel(R)

There is a good amount of information in these tables, and it just takes some trial and error to figure out what is in them. For example, I can use this query to get a count of processes running on host01 over time.

select begintime,count(*)
from CHM.CHMOS_PROCESS_INT_TBL
where hostname='host01'
group by begintime
order by begintime;

I intentionally did not include the output as it would be too long for a blog post. Here are a few more sample queries that you can try on your GIMR database.

Disk I/O activity for a specific host over time.

select begintime,
DISK_BYTESREADPERSEC/1024/1024 as MB_READ_SEC,
DISK_BYTESWRITTENPERSEC/1024/1024 as MB_WRITE_SEC,
DISK_NUMIOSPERSEC as IO_PER_SEC
from CHM.CHMOS_SYSTEM_SAMPLE_INT_TBL
where hostname='host01'
order by begintime;

Swapping on a specific host over time.

select begintime,swpin,swpout
from CHM.CHMOS_SYSTEM_SAMPLE_INT_TBL
where hostname='host01'
order by begintime;

The next SQL statement will compute a histogram of disk I/O activity. I’m sure someone else can come up with a more elegant version as my SQL statements tend to be more brute-force.

select first.num_count as "<=10ms",
 second.num_count as "<=20ms",
 third.num_count as "<=50ms",
 fourth.num_count as "<=100ms",
 fifth.num_count as "<=500ms",
 final.num_count as ">500ms"
from
(select count(*) as num_count from CHM.CHMOS_DEVICE_INT_TBL 
 where devid='sda1' and latency between 0 and 10) first,
(select count(*) as num_count from CHM.CHMOS_DEVICE_INT_TBL 
 where devid='sda1' and latency between 11 and 20) second,
(select count(*) as num_count from CHM.CHMOS_DEVICE_INT_TBL 
 where devid='sda1' and latency between 21 and 50) third,
(select count(*) as num_count from CHM.CHMOS_DEVICE_INT_TBL 
 where devid='sda1' and latency between 51 and 100) fourth,
(select count(*) as num_count from CHM.CHMOS_DEVICE_INT_TBL 
 where devid='sda1' and latency between 101 and 500) fifth,
(select count(*) as num_count from CHM.CHMOS_DEVICE_INT_TBL 
 where devid='sda1' and latency > 500) final;
<=10ms     <=20ms     <=50ms     <=100ms    <=500ms    >500ms
---------- ---------- ---------- ---------- ---------- ----------
    150693         10          1          0          0          0

 

There is a good amount of information in the CHM schema. I expect that this information is mostly educational and that most people will not be querying the CHM tables directly. But it is good information to know and may help others.

 

SQL Developer 4.1.3 Released


Identifying ASH Sequence Contention in RAC


In Chapter 3 of Oracle RAC Performance Tuning, I showed how improper CACHE values for sequences can cause poor performance in Oracle RAC. I also showed how to spot sequence contention when looking at a session’s wait events.

Today, I was working with a developer who was creating a new sequence. The sequence had a CACHE value of 100, which I initially thought was too low a value. I spotted this low setting during code review. The developer thinks the CACHE value is fine, but I'm not convinced, so we will test this under load to see if the CACHE value needs to be adjusted.

In the meantime, I was thinking, "what if I missed this during code review?" And a follow-on question: "what if we didn't notice anything during load testing?" I want to be able to go back and determine which sequences, if any, would be candidates for having an improper CACHE setting. I could certainly trace sessions and analyze the trace files, but that would be too painful. So I devised a script that I can run against Active Session History to help identify candidate sequences.

 

select sh.sql_id,to_char(st.sql_text),count(*)
from dba_hist_active_sess_history sh
join dba_hist_sqltext st
 on sh.sql_id=st.sql_id
where st.sql_text like '%NEXTVAL%'
 and (event='row cache lock' or event like 'gc current block %-way')
group by sh.sql_id,to_char(st.sql_text)
order by count(*) desc;

 

This is not a perfect science due to the nature of ASH collection; a session experiencing the contention would need to be caught at just the right time to appear in DBA_HIST_ACTIVE_SESS_HISTORY. But the SQL statement above does give me some candidates for consideration. Not every sequence accessed in the SQL statements returned needs to have its CACHE value adjusted; further analysis would be needed. However, this does give me a list of sequences to consider, and it can help answer my initial questions. If I missed the sequence creation during code review, I can hopefully find it later if the sequence becomes a problem for application performance.
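
If a sequence does turn out to be a contention point, the usual remedy is to raise its CACHE value. A hedged sketch, with a placeholder sequence name and cache size:

-- Placeholder object name; pick a cache size appropriate to your insert rate
ALTER SEQUENCE app_owner.order_id_seq CACHE 1000;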

Complacency leads to: Risk Becomes Reality


I was participating in a recent thread on the OTN community where someone was asking questions about downgrading after a database upgrade. One of the responses asked how many people actually practice database downgrades. I created this poll to find out.

I was surprised to find one contribution to that thread which said:

I have done my fair share of upgrades – and never had to downgrade even once

Now that poster didn't explicitly say it, but it was almost as if that individual was saying that practicing downgrades is a waste of time because they won't ever need it. I'll give the poster the benefit of the doubt and assume this Oracle employee was not actually saying that. I'm not trying to pick on this individual; I'll just let the thread provide me the opportunity to discuss the topic from a more generic viewpoint. (Update: the poster who prompted me to write this blog entry came back to the thread in the time it took me to write this and did say, "did not mean to imply that we should not 'test' downgrades.")

Back in July, I wrote a blog post about The Data Guardian. In that blog post, I said:

the database exists for one reason, to provide efficient and timely access to the data.

The DBA needs to protect the data. That is job #1. Job #2 is for the DBA to provide efficient and timely access to the data. What good is having the data if the people who need access to it cannot get to the data? If those people have terrible performance when interacting with the data, then they might as well have no access.

As DBAs, we need to perform risk management. We need to determine which risks might become reality. The DBA's job is to measure those risks and determine two plans of action: what steps can be taken to keep the risk from becoming reality, and what steps are needed to resolve the issue if it does.

Even a junior-level DBA will understand the importance of backups. Backups are a risk management strategy. If data is lost, we can recover the data from the backup. And even a junior-level DBA understands the importance of being able to restore from the backup.

In this OTN thread, I wrote this:

Painful fact of anyone in IT:   The moment we become complacent and willfully refuse to perform risk mitigation is the moment in time the wheels are set in motion and that risk becomes reality. It is that moment our careers hang in the balance. Will you be prepared to respond?

To me, this is a Murphy's Law sort of thing. I've said similar things in the past. The idea (and it's the whole point of this blog entry) is that if I don't take appropriate risk management steps, then I'm just asking the gods to turn that risk into reality. If I refuse to adjust my rear view mirror and use it when I'm backing up my vehicle, well, that's the day I back into something. If I refuse to tie my shoelaces, that's the day I step on one and trip. The day I refuse to wear protective goggles when using a power tool is the day I get something in my eye. The day I go to the beach and refuse to put on sunscreen is the day I come home with a sunburn. You get the idea.

Some readers may be thinking that I'm crazy and that the universe doesn't have a master plan to screw with me just because I'm being complacent. And I would agree. So I'll say it another way: if I do not plan to mitigate a risk, then I have done nothing to stop it from becoming a reality. The chances of it becoming a reality do not decrease because of my inaction.

There are two major components to risk management: 1) determining the probability of a risk item occurring, and 2) determining the impact when that risk does occur. The items with the highest probability of occurring are mitigated first. This part is easy and something many people working on risk management already do. They put the risk items into a spreadsheet and fill in some value for the probability of each risk occurring. When complete, they sort on the probability column and start risk mitigation from the top down. Many risk management strategies draw a line somewhere in the middle of the list and decide that any risk item below that line has too low a probability to worry about. We can't mitigate all possible risks in the universe; there just isn't enough time to handle it all. So we have to draw the line somewhere.

One of the failings I see all the time is that risk management does not spend much time focusing on the impact of a risk becoming reality. The spreadsheet needs a similar column providing a rating of the impact to the business for each risk item, and the risk manager needs to sort the spreadsheet on this column as well. Any item with a big impact needs risk mitigation activities even if it has a low probability of occurring! Sadly, too many in the risk management business fail to include this step of assessing the risk impact. Again, when the spreadsheet is sorted by impact to the business, a line is drawn somewhere.

One may find that risk items with a HIGH probability have a LOW or even VERY LOW impact to the business. I like risk management spreadsheets that include a third column which is “probability x impact”. This column helps understand the relationship between the two risk components.

Side Bar: Notice how when I talk about risk management I talk about probability and impact. If you aren’t thinking about both of these areas, then you are only performing half the risk management you should be.

Let's go back to the database upgrade question that prompted this blog post. I think everyone reading this article should agree that upgrading an Oracle database is a risky proposition. There are so many different things that can go wrong with an Oracle database upgrade. The probability of an upgrade failure is HIGH. Risk mitigation items often include, but are not limited to, practicing the upgrade on clones of production and backing up the database before the upgrade process begins. Why do we do this? Well, the impact to the business is VERY HIGH. If we fail when upgrading our production database, then our business users have no access to the data. We aren't a very good Data Guardian if we cannot get past this failure. If we practice the upgrade sufficiently in non-production environments, we can reduce the probability of the risk item to MEDIUM. But in all likelihood, we cannot reduce that specific risk probability to LOW. That is why we take the backup before the upgrade begins. Should we still have problems even though we have done our level best to reduce the probability of that risk item, the impact to the business is still VERY HIGH. So the DBA's risk remediation strategy is to take notes on where and what caused the upgrade to fail, and to restore from the backup. The database is up and running, and we have eliminated the impact to the business. The DBA then goes back to the drawing board to determine how to resolve what went wrong, attempting to reduce the probability of that problem occurring again when they come back at a later point in time to attempt the upgrade once more.

So let's go back to the comment in the OTN thread that seemed to say practicing database downgrades isn't worth the time. I disagree, and my disagreement has everything to do with the impact to the business. I do agree with the comment the poster made in their reply:

thorough testing of all of the critical parts will identify any issues and have them resolved before the production upgrade.

I agree with that 100%. Why do we do this “thorough testing”? It is all because of risk mitigation. We are attempting to reduce the probability that the upgrade will cause poor performance or cause application functionality to break. But even as that poster said, “There will always be issues that pop-up in production after the upgrade because it is impossible to test 100% of your application.”  Again, I agree 100% with what this poster is saying here. But what about the impact to the business? I’ll get to that in a minute, but first I have to digress a bit in this next paragraph…

I recently upgraded a critical production system from 11.2.0.4 to the 12.1.0.2 version. Where I work, we have more application testing than I’ve ever seen in my other jobs. We have a full QA team that does testing for us. We even have a team that is in charge of our automated testing efforts. We have automated robots that exercise our application code nightly. On top of all of that, we have another automated routine that whenever people push code changes to Test or Prod, this routine does a quick examination of critical code paths. I upgraded development environments (more than 15 of them) to 12.1.0.2 and then waited one month. I then upgraded Test and waited 3 weeks before I upgraded production. There were issues found and resolved before we upgraded production. But even after all of that, I had big issues once production was upgraded. You can visit my blog posts in mid-October to mid-December to see some of those issues. I was very close to downgrading this database but I managed to work through the issues instead. Now back to the point I was making…

After the upgrade is complete, the database is opened for business. Application users are now allowed to use the application. What happens inside the database at this point? Transactions! And transactions mean data changes. The moment the DBA opens the database for business after an upgrade, data changes start occurring. After all, that's the whole point of the database, isn't it? Capture data changes and make data available to the application's end users.

So what happens if you're in the boat I was in last fall with my database upgrade? I was hitting things that we did not see in non-production, even after all of our testing. The impact to the business was HIGH. I needed to reduce this impact to the business. I had three options: 1) fix the issues, one by one; 2) restore from the backup I took before the upgrade so that I could get the database back to the old version; or 3) downgrade the database and go back to the drawing board. I chose the first option, as I always have during my career. But what if that had not been sufficient? It can take time to resolve the issues, and some businesses simply cannot afford that kind of time with that much negative impact to the business. How many websites have been abandoned because performance was terrible or things didn't work correctly? And for the strong majority of production databases out there, option 2 has a very terrible impact to the business: you'll lose transactions made after the upgrade was completed! The DBA won't be able to roll forward past the upgrade while keeping the database at the old version, so data will be lost, and for many production databases this is unacceptable. The business may be able to afford one hour of data loss, but how many people would pull the trigger on this action within one hour of the upgrade? In all likelihood, this action would be performed days after the upgrade, and the impact to the business for that kind of data loss is well above VERY HIGH. So that leaves option 3 as the option with the lowest impact to the business to help resolve whatever impacts the business is experiencing after the upgrade.

You can probably tell from that last paragraph that I feel it is important for the Oracle DBA to know how to downgrade their database after an upgrade is complete. I'll concede that the probability of the DBA needing to perform a downgrade is VERY LOW, but the impact of not being able to downgrade may be catastrophic to the business. (There are those two words again.) Because the probability is low, I don't practice downgrades often, but because the impact of not being able to downgrade is very high, I do practice them once in a while.

So in closing, I'm going to go back to that Murphy's Law thing again. The universe is not conspiring against me, but as the Data Guardian, I need to practice good risk management principles. That means assessing the probability and the impact of the risk items imposed by my change. While the universe and the gods may not make Murphy's Law or its cousins kick into gear, I'm not doing myself any favors if I fail to mitigate risk items; I am not reducing the probability one bit.

 

 

 

A Tale Of Two Clustering Factors


I was looking at a post on the MOSC forums today about the Clustering Factor (CF) for an index. One thing people tend to forget when talking about the CF is that while the DBA can do some reorg activity to improve the CF for one index, it will potentially come at the expense of another index on that same table. Consider this example, which I provided in that thread.

Here I have a table with two indexes; it's the only table in my schema. One index (IDX2) has a CF much higher than the other (IDX1).

SQL> select index_name,clustering_factor from user_indexes;
INDEX_NAME      CLUSTERING_FACTOR
--------------- -----------------
MY_TAB_IDX2                135744
MY_TAB_IDX1                  2257

The DBA wants to "fix" this issue by reducing the CF for IDX2. The best way to do that is to pull the data out of the table and then insert it back, sorted by the column(s) IDX2 is built on.

SQL> create table my_tab_temp as select * from my_tab;
Table created.
SQL> truncate table my_tab;
Table truncated.
SQL> insert into my_tab select * from my_tab_temp order by pk_id;
135795 rows created.
SQL> commit;
Commit complete.
SQL> exec dbms_stats.gather_table_stats(ownname=>USER,tabname=>'MY_TAB',cascade=>TRUE);
PL/SQL procedure successfully completed.
SQL> select index_name,clustering_factor from user_indexes;
INDEX_NAME      CLUSTERING_FACTOR
--------------- -----------------
MY_TAB_IDX2                  2537
MY_TAB_IDX1                135747

Now the CF for IDX2 has definitely improved, but look at the CF for IDX1. It got much worse. In fact, the two indexes seem to have swapped CF values. If I attempt another reorg, this time ordering by the IDX1 column(s), the CF values will flip again.

The moral of this story is that one cannot guarantee that improving the CF for one index won't have a negative effect on another index of that table.

Oracle 12c IDENTIFIED BY VALUES


For as long as I can recall in my career working with Oracle, I've been able to set a user's password using the password hash. The trick I've sometimes employed was to store the user's password hash, change the password to something I know, connect to the database as that user and do my work, and when done, set the password back to whatever it was, similar to the following:

ALTER USER bob IDENTIFIED BY VALUES 'asdf1234%^&*qwerty';

I never needed to know the user's password to set it back, so long as I knew what the hash value was. Yesterday I found some information where people were receiving the following error when attempting to set a password this way in 12c:

ORA-02153: invalid VALUES password string

If you look up this error in My Oracle Support, you will most likely land on Note 2096579.1. That note states that this method is no longer possible, saying "This is new functionality in 12c to force users to be created in the right way". But I've found that this is not exactly true.

Oracle 12c introduced new functionality to make the userid/password hash values more secure. Here is a link to the 12c Security Guide where it talks about the 12c Verifier for passwords. Note in that section, it mentions a salt value added to the password when it is hashed. To see why this is important, let’s look at an example. I’ll create a user and look at the userid/password hash stored in the SPARE4 column of SYS.USER$.

 

SQL> create user bob identified by abc123;
User created.
SQL> grant create session to bob;
Grant succeeded.
SQL> select spare4 from sys.user$ where name='BOB';
SPARE4
--------------------------------------------------------------------------------
S:44F34BA1369D58A6CB262D166587D5238D9148FC9BDD390A98C29A3B6A34;H:FD30F9DA6ECB907
6C10C04D20AFF9492;T:450FF7F2A4BB8104E33E7C09FF1698AEA2DE3EBD60BFA681942057D83EE2
DD773BB4F7B1046355D1CB63EBF256BC7B466BB1B3185A0988D1CBAE3276D1B181756DB27BB40505
8C44152DB2DD41074396

 

In previous versions, the SPARE4 column wouldn’t contain nearly that many characters. This is definitely more complex than pre-12c versions. My guess, although unconfirmed, is that the S: part of the output above is the salt value. I’m not sure what H: and T: represent.

We can use the DBMS_METADATA package to reverse engineer a user. When we do that, we can see that we still can use the IDENTIFIED BY VALUES clause.

SQL> select dbms_metadata.get_ddl('USER','BOB') from dual;
DBMS_METADATA.GET_DDL('USER','BOB')
--------------------------------------------------------------------------------
CREATE USER "BOB" IDENTIFIED BY VALUES 'S:44F34BA1369D58A6CB262D166587D5238D9
148FC9BDD390A98C29A3B6A34;H:FD30F9DA6ECB9076C10C04D20AFF9492;T:450FF7F2A4BB8104E
33E7C09FF1698AEA2DE3EBD60BFA681942057D83EE2DD773BB4F7B1046355D1CB63EBF256BC7B466
BB1B3185A0988D1CBAE3276D1B181756DB27BB405058C44152DB2DD41074396;5844087A3D506FD3
'
 DEFAULT TABLESPACE "USERS"
 TEMPORARY TABLESPACE "TEMP"

 

And in fact, that does work. I’ll change BOB’s password to something different, then change it to this hash value and connect with the old password.

SQL> alter user bob identified by newpass;
User altered.
SQL> alter user bob identified by values 'S:44F34BA1369D58A6CB262D166587D5238D9148FC9BDD390A98C29A3B6A34;H:FD30F9DA6ECB9076C10C04D20AFF9492;T:450FF7F2A4BB8104E33E7C09FF1698AEA2DE3EBD60BFA681942057D83EE2DD773BB4F7B1046355D1CB63EBF256BC7B466BB1B3185A0988D1CBAE3276D1B181756DB27BB405058C44152DB2DD41074396;5844087A3D506FD3';
User altered.
SQL> connect bob/abc123
Connected.

So we haven't lost the functionality the MOS note implied we had; we just have to deal with a much longer hash value.

Where this becomes really important is when trying to use exp/imp or Data Pump to move users from a pre-12c version to 12c. If you do a FULL export of an Oracle 11g database, the dump will contain the old password hash values. When importing into 12c, that is when you’ll receive the ORA-02153 error. To get around this issue, pre-create the users in the 12c database with known passwords.
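
As a hedged illustration (the username and password below are placeholders), pre-creating an account before the import might look like this; the import should then typically report that the user already exists and continue with that user's other objects.

-- Placeholder account created ahead of the 12c import so the old-style hash is never applied
CREATE USER app_user IDENTIFIED BY "Chang3Me_AfterImport";
GRANT CREATE SESSION TO app_user;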

SQL Server 2016 on Linux


Yesterday, Microsoft announced that it will be shipping a version of SQL Server 2016 (to be released later this year) that will run on Linux.  It didn’t take long for the media to get the word out. I quickly found a story here and here.

Right now, SQL Server 2016 is only available for early beta testing to a few select groups, and I am not one of them. So I can only speculate what MSSQL on Linux will look like. I'll be very curious how well SQL Server will work on Linux. I expect some functionality to either not be available or look totally different. It has to. For starters, I'm used to logging into my Windows workstation, authenticated by Active Directory. That serves as a single sign-on for connecting to SQL Server, as SQL Server has native integration with AD. How will this work on Linux? SQL Server has lots of integration with WMI, which we'll lose on Linux. What are all of those DBAs going to do to convert their PowerShell scripts? I set up SQL Server to use the Event Viewer for an audit trail; I'm guessing I'll have to write to a text file on Linux. SQL Server is tightly integrated with Windows. Setting up a MS Failover Cluster was a snap, and getting an Active/Passive SQL Server instance up and running on the FC was child's play. All of this was made very easy due to the tight integration between the RDBMS and the OS. How will this change on Linux?

Which Linux distro can I run SQL Server on? I’ve read that Microsoft worked with Canonical quite a bit. Will Ubuntu be the only Linux I can run SQL Server on? Or will I see the two big dogs working together once again, meaning Microsoft SQL Server on Oracle Linux?

I haven't found much information as to why MS is now going to let SQL Server run on Linux. I've seen some media reports quote CEO Satya Nadella indicating that MS was going to embrace open source more. I've heard other media reports mention that this was a way to get SQL Server installed more in the cloud. But I learned a long time ago that when you want to know the motivation behind a business decision, it all comes down to money. Ten or fifteen years ago, if you were setting up a new database system, you chose an RDBMS platform; the only question was which one. Today's non-traditional database systems (MongoDB, Hadoop, etc.) have changed the landscape significantly. We've all seen the stories about the impact of these non-RDBMS database platforms on Oracle Corp's revenue stream and how it is helping to drive Oracle's rush to the cloud. Well, don't think this is an Oracle-only issue; other RDBMS vendors are under the same pressures Oracle is. Simply put, letting SQL Server run on Linux is Microsoft's way of increasing the product's potential marketplace. Follow the money, and you'll see that this decision is all about trying to increase market share in a highly competitive market.

I’ve always been vocal about the fact that I hate running Oracle on Windows! Back in the Oracle 8 and 8i days, patching was a nightmare. Native Windows OS scripting has never been great so I learned to rely on Perl back in those days. My preference for Oracle has always been to run it on Unix/Linux. A few years ago, my company bought a competitor and I inherited an Oracle database that still runs on Windows to this day. My skin crawls when I have to sign on to the server to do some administrative tasks. That server will finally be terminated this year and I’ll be rid of Oracle on Windows here.

All that being said, I cannot see myself rushing to run SQL Server on Linux. I'm sure I'll load it up once and play around with it, but for real work I'll still run MSSQL on Windows. The tight integration makes a number of things easier, and I do not see any advantages so far in making the OS switch. But I'm still very interested in seeing it run.

My Twitter feed blew up yesterday with this announcement. Of course I follow a lot of Oracle people. Many are wondering if Hell froze over or if pigs are now flying. This announcement has done one thing, even before the product ships. It has generated lots of buzz. A lot of people are talking about SQL Server today.

V$SQL_SHARED_CURSOR TOP_LEVEL_RPI_CURSOR


I was working with an individual on a question in the MOSC forums recently where they asked about the TOP_LEVEL_RPI_CURSOR column of the V$SQL_SHARED_CURSOR view. There is little documentation on what this column is trying to tell the DBA.

All the Oracle docs say is that this column contains “(Y|N) Is top level RPI cursor”. So what does that mean?

I’m going to assume the reader of this post is familiar with child cursors. That will save me a large amount of introductory information. The V$SQL_SHARED_CURSOR view will tell the DBA why a child cursor and its parent have different versions in the Shared Pool. If the child cursor’s OPTIMIZER_MISMATCH column contains a ‘Y’ in this view, then the session executing the cursor had different optimizer settings than the session that was responsible for the parent cursor execution.

So what does it mean when TOP_LEVEL_RPI_CURSOR is set to Y for a child? The documentation isn’t clear. MOS has very little on the subject. And all of my Google hits on this column pretty much just regurgitate the documentation. To know why, it helps to know that RPI stands for Recursive Program Interface. This is part of the Oracle kernel that deals with recursive SQL. In our case, it deals with the fact that the SQL statement was issued at a different “depth”.

What is recursive SQL? It is SQL that is issued on your behalf, which means at a different depth as I will illustrate. First off, Oracle is performing recursive SQL all the time. At a basic level, when you issue “select * from table_name”, Oracle queries the Data Dictionary to ensure the object exists and that you have permissions on that table. How does Oracle do that? It uses other SQL statements. The statement you issue is at level 0, the base level. When Oracle issues a SQL statement to check if the table exists, that will be at the next level, level 1. Sometimes, that will cause other SQL statements to be issued at the next level, level 2.

The depth of a SQL statement is not limited to just what Oracle is doing in the background, on your behalf. Consider when you execute a stored procedure. Your call to the stored procedure is at depth 0. Any SQL statement in the stored procedure is at depth 1. If that stored procedure calls another procedure, the SQL in the other procedure will be at depth 2.

I used this information on recursive SQL and SQL depth to construct a simple example in my Oracle 12.1.0.2 database. First, I created a stored procedure.

create or replace procedure my_sysdate 
as 
 v_dt date;
begin
 select sysdate into v_dt from dual;
end;
/

I then fired up a SQL*Plus session and started a trace. I issued the same SQL statement and then I called my procedure.

 

SQL> alter session set sql_trace=true;
Session altered.
SQL> SELECT SYSDATE FROM DUAL
 2 /
SYSDATE
---------
05-APR-16
SQL> exec my_sysdate;
PL/SQL procedure successfully completed.
SQL> exit

 

When I examined the raw trace file, I found the two calls to SYSDATE from DUAL as follows:

 

PARSING IN CURSOR #140670990815296 len=24 dep=0 uid=9449 oct=3 lid=9449 tim=24905125014484 hv=124468195 ad='81477be0' sqlid='c749bc43qqfz3' SELECT SYSDATE FROM DUAL

PARSING IN CURSOR #140670907623848 len=24 dep=1 uid=9449 oct=3 lid=9449 tim=24905129780963 hv=124468195 ad='81477be0' sqlid='c749bc43qqfz3' SELECT SYSDATE FROM DUAL

 

If you look at the trace file closely, you will see that the second one at depth=1 was a direct result of the stored procedure. Notice that even though my stored procedure was defined in all lower case, the SQL issued at depth=1 was in all upper case. As a result, when I issued the same SQL statement directly in my SQL*Plus session (at depth=0), I had to use the same upper case form of that statement so that it would have the same SQL ID value.

The trace file also shows the SQL ID. I can now query V$SQL_SHARED_CURSOR for that SQL ID value and show that TOP_LEVEL_RPI_CURSOR is set for the child.

SQL> select sql_id,top_level_rpi_cursor from v$sql_shared_cursor where sql_id='c749bc43qqfz3';
SQL_ID T
------------- -
c749bc43qqfz3 N
c749bc43qqfz3 Y

So there we have our proof. The only difference between these two cursors is the depth from which they were executed. I'm not sure why Oracle needs this distinction in the Shared Pool. If anyone knows, drop me a line.

Normally, we do not care about a few extra versions, a few child cursors for a given SQL ID. If your SQL statement has a high number of versions, then it is probably not due to different depth levels; other reasons would be more relevant to why a SQL statement has a high number of child cursors. But this does answer the question of what that column is telling us.
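
As a starting point for investigating a high version count, here is a hedged sketch of a query to list statements with many child cursors; the threshold of 20 is an arbitrary placeholder.

-- Each row in V$SQL is one child cursor, so the count per SQL_ID is the version count
select sql_id, count(*) as child_cursors
from v$sql
group by sql_id
having count(*) > 20
order by child_cursors desc;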

 

 

Oracle RAC N+1 Redundancy


I find that when people are designing Oracle RAC architecture, they often do not think of N+1 redundancy in their implementation plans. There are two reasons to implement Oracle RAC, availability and scalability. For the purposes of this discussion, I am focusing only on the availability side. If your RAC deployments are for scalability reasons only, then this topic may not apply to you.

So what is N+1 Redundancy? Simply put, if you need N units of something, then for redundancy purposes, you should have N+1 of that item. Let’s look at a database server. It must have a power supply. That is a requirement. Without a working power supply, the server will not function at all. The minimum number of power supplies is 1. If we want this server to have a high degree of availability, we will make sure it has N+1 power supplies, or in this case, dual power supplies. If there is only one power supply and it fails, it takes the server with it. If we have an extra power supply, a spare unit, the loss of one power supply will not take down the server with it. Redundancy is a great thing to have and an essential component to a high availability infrastructure.

When designing an Oracle RAC system, the DBA needs to determine how many nodes are needed to support the end user’s demands. If the DBA determines 4 nodes are needed, and this RAC cluster must exhibit high availability traits, then it is vital for the DBA to create a 5 node cluster (4+1). If the resource demands are sufficient to keep 4 nodes busy and one node is lost, the remaining 3 will not be able to keep up with the workload. If the DBA builds the RAC system with N+1 capability in mind, then the loss of one node will not be noticeable by the end users.  If the DBA builds the RAC cluster without N+1 redundancy, then the loss of one node may be so terrible for the end user’s performance, that the entire cluster might as well be down. When designing your RAC implementations, strive for N+1 redundancy.

I remember two years ago, I had a RAC cluster that lost a node. No problem, we still had two nodes available. As I watched the performance of the two remaining nodes, they seemed to be pretty overwhelmed. Our call center started receiving complaints. I worked with other administrators on the IT team to get that node back up and running as fast as possible, but this may not always be the case if the reason for the outage is hardware related and parts need to be replaced. After the node was back in service, I monitored the cluster performance for weeks later. Our usage had grown since this system was initially designed. We had initially designed this system with N+1 redundancy in mind, but our usage grew and N went from 2 to 3. Our current 3-node cluster was no longer N+1 redundant. So I worked with management to put into the next year’s budget enough funds to procure a new node and make sure Oracle was licensed on it. I sleep much better at night knowing that I am back to N+1 redundancy.

Like many implementations out there, my RAC system is not the only high availability feature built into our infrastructure. This RAC database is a primary to a physical standby database with Oracle Data Guard. I'm surprised, when discussing RAC standby databases with other Oracle DBAs, how many of them are not thinking of N+1 capability for their standby. The physical standby database is my safety net in case the primary data center is unavailable for some reason. I've seen so many Oracle DBAs implement a single-instance standby for a multi-node RAC primary. Ouch! I hope they never have to fail over; their entire multi-node RAC cluster's workload will struggle mightily on that single-instance standby. So as you design your RAC implementations, for both the primary and the standby, consider the N+1 redundancy implications on the architecture design.

Where I probably differ from many people is that my physical standby implementations are not N+1 capable, but rather N. I skip the redundant extra node for my physical standby. Why is that? Purely cost. My physical standby is just a safety net. I want it to work for me the day that I need it, but I hopefully never need it. The physical standby is my insurance policy in case risk becomes reality. For me, that extra "+1" at the standby site is over-insurance. I can save on the physical hardware and Oracle licensing.

So let's say the day comes and I do fail over to the standby. I have lost my N+1 redundancy. But what are the chances that I'm going to lose the primary data center *and* lose one of the nodes in my standby cluster? Pretty slim. The likelihood of failures at two sites at the same time is pretty small. At this point, our IT team is evaluating why our primary data center is lost and when we can most likely return our operations to that facility. If the primary data center lost all its power and the utility company says service will be restored by tomorrow, then we'll simply run at the standby data center even though we only have N nodes for the RAC database there. However, if the primary data center was wiped out by a fire, it will likely take many months before it is up and running again. It is at that point that I need to plan on getting the physical standby up to N+1 redundancy, as our time using the standby as a primary will be much longer. So we rush-order another server and add it to the cluster as soon as possible. In short, I design my standby RAC database as N, not N+1, with an eye toward increasing it to N+1 in short order if we determine we will be using the standby for real for a longer period of time.

So there is one other special case I would like to discuss: the case where the DBA determines that N=1. For the current workload requirements, one node is sufficient. But we want high availability, so we design a two-node RAC cluster for the primary database. We now have N+1 redundancy built into the primary. Following my last paragraph, my standby database only needs 1 node. The mistake I see some people make is to create the standby as a single-instance database. So far, their logic makes sense. The primary is N+1 and the standby is N. So far so good. Where I differ is that I make the standby a one-node RAC cluster, not a pure single-instance implementation. The reason is future growth. At some point, the DBA may find that N no longer equals 1 at the primary. Usage has grown and N needs to be 2 now. The DBA wants to grow the primary to 3 nodes (2+1). This is easily done with zero downtime: add a new node to the cluster and extend the RAC database to that new node. But it’s not so easily done at the standby if the one node that exists there is not RAC-enabled. If a pure single-instance standby is all that exists, the DBA needs to scrap it and move it to a two-node cluster. If the DBA had the foresight to install Grid Infrastructure as if the physical standby were a single-node cluster, then all the DBA has to do is add a new node, just like they did on the primary side.

As you’re designing your RAC implementations, consider ensuring you have N+1 capability on the primary and at least N on the standby. If a company determines that the standby is too critical, it may want to implement N+1 at the standby as well. And if the DBA determines that N=1, consider making the standby at least a single-node RAC cluster.


Recreate Bad RAC Node


Just recently, I was trying to apply the latest and greatest Patch Set Update (PSU) to a 2-node Oracle RAC system. Everything went smoothly on the first node. I did have problems when trying to apply the PSU to the second node. The problem wasn’t with OPatch or the PSU, but rather, I could not even bring down Grid Infrastructure (GI) successfully. And to make matters worse, it would not come up either.

I tracked my issue down to the Grid Inter Process Communication daemon (gipcd). When issuing ‘crsctl stop crs’, I received a message stating that gipcd could not be successfully terminated. When starting GI, the startup got as far as trying to start gipcd and then quit. I found many helpful articles on My Oracle Support (MOS) and with Google searches. Many of those documents seemed to be right on track with my issue, but I could not successfully get GI back up and running. Rebooting the node did not help either. The remainder of this article can help even if your issue is not with gipcd; it was just the sticking point for me.
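
For reference, the basic stop/start attempts look like this (the forced stop with -f is a standard variant I’m showing for completeness, not a fix for this particular issue):

[root@host02]# crsctl stop crs
[root@host02]# crsctl stop crs -f
[root@host02]# crsctl start crs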

So at this juncture, I had a decision to make. I could file a Service Request (SR) on MOS, or I could “rebuild” that node in the cluster. I knew if I filed an SR, I’d be lucky to have the node operational any time in the next week. I did not want to wait that long, and if this were a production system, I could not have waited that long. So I decided to rebuild the node. This blog post will detail the steps I took. At a high level, this is what is involved:

  1. Remove the node from the cluster.
  2. Clean up any GI and RDBMS remnants on that node.
  3. Add the node back to the cluster.
  4. Add the instance and service for the new node.
  5. Start up the instance.

In case it matters, this system is Oracle 12.1.0.2 (both GI and RDBMS) running on Oracle Linux 7.  In my example, host01 is the “good” node and host02 is the “bad” node. The database name is “orcl”.  Where possible, my command will have the prompt indicating the node I am running that command from.

First, I’ll remove the bad node from the cluster.

I start by removing the RDBMS software from the good node’s inventory.

[oracle@host01]$ ./runInstaller -updateNodeList ORACLE_HOME=$RDBMS_HOME "CLUSTER_NODES={host01}" LOCAL_NODE=host01

Then I remove the GI software from the inventory.

[oracle@host01]$ ./runInstaller -updateNodeList ORACLE_HOME=$GRID_HOME "CLUSTER_NODES={host01}" CRS=TRUE -silent

Now I’ll remove that node from the cluster registry.

[root@host01]# crsctl delete node -n host02
CRS-4661: Node host02 successfully deleted.

Remove the VIP.

[root@host01]# srvctl config vip -node host02
VIP exists: network number 1, hosting node host02
VIP Name: host02-vip
VIP IPv4 Address: 192.168.1.101
VIP IPv6 Address: 
VIP is enabled.
VIP is individually enabled on nodes: 
VIP is individually disabled on nodes: 
[root@host01]# srvctl stop vip -vip host02-vip -force
[root@host01]# srvctl remove vip -vip host02-vip
Please confirm that you intend to remove the VIPs host02-vip (y/[n]) y

Then remove the instance.

[root@host01]# srvctl remove instance -db orcl -instance orcl2
Remove instance from the database orcl? (y/[n]) y

At this point, the bad node is no longer part of the cluster, from the good node’s perspective.

Next, I’ll move to the bad node and remove the software and clean up some config files.

[oracle@host02]$ rm -rf /u01/app/oracle/product/12.1.0.2/
[root@host02 ~]# rm -rf /u01/grid/crs12.1.0.2/*
[root@host02 ~]# rm /var/tmp/.oracle/*
[oracle@host02]$ rm -rf /tmp/*
[root@host02]# rm /etc/oracle/ocr*
[root@host02]# rm /etc/oracle/olr*
[root@host02]# rm -rf /pkg/oracle/app/oraInventory
[root@host02]# rm -rf /etc/oracle/scls_scr

I took the easy way out and just used ‘rm’ to remove the RDBMS and Grid home software. Things are all cleaned up now. The good node thinks it’s part of a single-node cluster and the bad node doesn’t even know about the cluster. Next, I’ll add that node back to the cluster using the addnode utility on host01.
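
Before doing that, a quick sanity check on host01 with olsnodes should show only the one remaining node (a sketch of the expected output; the node number column may differ in your cluster):

[oracle@host01]$ olsnodes -n
host01  1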

[oracle@host01]$ cd $GRID_HOME/addnode
[oracle@host01]$ ./addnode.sh -ignoreSysPrereqs -ignorePrereq -silent "CLUSTER_NEW_NODES={host02}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={host02-vip}"

This will clone the GI home from host01 to host02. At the end, I am prompted to run root.sh on host02. Running this script will connect GI to the OCR and Voting disks and bring up the clusterware stack. However, I do need to run one more cleanup routine as root on host02 before I can proceed.

[root@host02]# cd $GRID_HOME/crs/install
[root@host02]# ./rootcrs.sh -verbose -deconfig -force

It is possible that I could have run the above earlier when cleaning up the node, but this is the point at which I chose to execute it. Now I run the root.sh script as requested.

[root@host02]# cd $GRID_HOME
[root@host02]# ./root.sh

At this point, host02 is now part of the cluster and GI is up and running. I verify with “crs_stat -t” and “olsnodes -n”. I also check the VIP.

[root@host02]# srvctl status vip -vip host02-vip
VIP host02-vip is enabled
VIP host02-vip is running on node: host02

Now back on host01, it’s time to clone the RDBMS software.

[oracle@host01]$ cd $RDBMS_HOME/addnode
[oracle@host01]$ ./addnode.sh "CLUSTER_NEW_NODES={host02}"

This will start the OUI. Walk through the wizard to complete the clone process.

Now I’ll add the instance back on that node.

[oracle@host01]$ srvctl add instance -db orcl -instance orcl2 -node host02

If everything has gone well, the instance will start right up.

[oracle@host01]$ srvctl start instance -db orcl -instance orcl2
[oracle@host01]$ srvctl status database -d orcl
Instance orcl1 is running on node host01
Instance orcl2 is running on node host02
SQL> select inst_id,status from gv$instance;

   INST_ID STATUS
---------- ------------
         1 OPEN
         2 OPEN

Awesome! All that remains is to reconfigure and start any necessary services. I have one.

srvctl modify service -db orcl -service hr_svc -modifyconfig -preferred "orcl1,orcl2"
srvctl start service -db orcl -service hr_svc -node host02
srvctl status service -db orcl


That’s it. I now have everything operational.

Hopefully this blog post has shown how easy it is to take a “bad” node out of the cluster and add it back in. This entire process took me about 2 hours to complete. Much faster than any resolution I’ve ever obtained from MOS.

I never did get to the root cause of my original issue. Taking the node out of the cluster and adding it back in got me back up and running. This process will not work if the root cause of the problem is hardware or OS related.

And the best part for me in all of this? Because host01 already had the PSU applied to both the GI and RDBMS homes, cloning those to host02 means I did not have to run OPatch on host02. That host received the PSU patch as part of the clone. All I needed to do to complete the patching was run datapatch against the database.
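
For anyone who has not run it before, the datapatch step is just a couple of commands from the RDBMS home once the database is open (the -verbose flag is optional but handy):

[oracle@host01]$ cd $RDBMS_HOME/OPatch
[oracle@host01]$ ./datapatch -verbose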

EM SQL Monitor Impact


In case anyone needs a reminder, it’s always a good idea to determine the impact of your monitoring tools on the very database you are monitoring. Some monitoring tools are lightweight and others are more obtrusive. I am using Enterprise Manager 13c to monitor a specific SQL statement while it’s running. I noticed in another monitoring tool (Lighty by Orachrome) that the following SQL statement was consuming a good amount of resources:

WITH MONITOR_DATA AS (
SELECT
INST_ID
,KEY
,NVL2 (
PX_QCSID
,NULL
,STATUS
) STATUS
,FIRST_REFRESH_TIME
,LAST_REFRESH_TIME
,REFRESH_COUNT
,PROCESS_NAME
,SID
,SQL_ID
,SQL_EXEC_START


I cut off the rest of the text. This SQL statement is literally a few thousand lines long. Yikes! But that isn’t the issue.  In Lighty, I noticed the activity in this screen shot.

sql_monitor1

The top SQL statement is my CPU pig. I blacked out the SQL text to protect potentially proprietary information. Notice the last SQL statement in the list, the EM monitoring query shown above. It is consuming a fair amount of resources just for monitoring the system.
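
If you want to put numbers on it yourself, a quick query against GV$SQL does the trick. This is just a sketch that matches on the distinctive WITH MONITOR_DATA prefix of the EM query:

select inst_id, sql_id, executions,
       round(cpu_time/1e6,1) cpu_seconds,
       round(elapsed_time/1e6,1) elapsed_seconds
  from gv$sql
 where sql_text like 'WITH MONITOR_DATA%'
 order by cpu_time desc;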

Here is a screenshot of the EM13c window.

sql_monitor2

When I turned off the Auto Refresh (it defaults to 15 seconds), the activity ceased on the system. I then manually press the refresh button when I need an update.
There are certainly times to use the auto refresh, even every 15 seconds. Just be mindful of the potential negative impact on the database.


Oracle RAC VIP and ARP Primer


I’ve been running across a number of questions about Virtual IP (VIP) addresses for Oracle RAC lately. I hope this blog post can help shed some light on what VIPs are, how they work, and why Oracle RAC leverages them. Before I go further, I should explain that I am not a network specialist. In fact, computer networking is probably my weakest point of everything that goes on in IT shops. So do not flame me if I do not get the networking details 100% accurate. I will explain this in terms that have served me well in my DBA career, especially working with Oracle RAC.

Most people are familiar with connecting any application, SQL*Plus or others, to a single-instance database server. In the end, their connection request is sent to a specific IP address. In our diagram below, the end user wants to connect to 192.168.1.1 to access the database. The network request gets routed to the network switch that that database server is connected to. This switch passes the request to the server that has the requested IP address.


vips_1

Most Oracle DBAs do not have a problem understanding this concept. Life does get a little more complicated when RAC is deployed as there are multiple machines (nodes) in the cluster.  In the next diagram, we have a two-node RAC cluster, each node having a different IP address.

vips_2

The end user doesn’t care which node his session is connected to. The end user just wants access to the cluster. Either node will suffice. The end user’s TNSNAMES.ORA configuration may say to try 192.168.1.1 first and if that doesn’t work, try 192.168.1.2 instead. In this way, Oracle RAC is providing a High Availability solution.
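
A minimal TNSNAMES.ORA entry along those lines might look like the following. The alias and service name are made up for illustration; the addresses match the diagram:

ORCL =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (FAILOVER = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.1)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.2)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = orcl))
  )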

Now we come to the entire reason Virtual IP addresses are used. What if the end user is trying to access the first node (192.168.1.1) but it is unavailable? The node is down for some reason. The end user could easily connect to the 192.168.1.2 node instead. However, due to the way TCP/IP networks work, it can take up to ten minutes for the connection attempt to 192.168.1.1 to time out before 192.168.1.2 is tried. That lengthy TCP/IP timeout is the sole reason Oracle RAC leverages VIPs. We simply want to reduce the time it takes to reach another node in the cluster should the requested node be unavailable.

A traditional IP is usually bound to the network card on the server. The IT admin will configure the server to always use that specific IP address and no other devices on the network will use the same IP.  Note: I’m trying to make this simple here and avoiding DHCP and lease registration for those that are familiar with the topics.

A virtual IP address is not bound to the network card, and it isn’t even defined in the OS’s network configuration. The VIP is not a real IP address in the same way a Virtual Machine is not a real system; it just appears to be real to those using it. So let’s look at our two-node cluster, but this time with VIPs defined for them.

vips_3

Our servers still have their regular IP addresses, 192.168.1.1 and 192.168.1.2 for NODE1 and NODE2 respectively. These two nodes also have VIPs associated with them. NODE1-VIP and NODE2-VIP are denoted as IP addresses 192.168.1.11 and 192.168.1.12 respectively. Each node in the RAC cluster has its regular IP address and a VIP. It may also be beneficial to know that the host name and the VIP names are often defined in DNS so that we do not have to remember the IP addresses specifically.

Notice that the end user is now requesting to access one of the VIPs. The only people who should be using these traditional IP addresses are IT administrators who need to perform work on the server. End users and any and all applications should connect with the VIP.

Remember that I said earlier that the VIP isn’t even defined in the OS? Well if that’s the case, then how does everything know that the VIP is assigned to that node? This is all handled by Grid Infrastructure (GI). When GI is installed, one of the Oracle Universal Installer (OUI) screens will ask for the names of the nodes in the cluster (the host names) along with the virtual hostname. The screen shot below shows how the 11g GI installation looked when asking for that information (screen shot from Oracle documentation).

1722115

The public hostname is configured in the OS by the administrator. The Virtual IP is not configured in the OS but Grid Infrastructure knows about it. To understand how this works, we need to digress a bit and understand Address Resolution Protocol (ARP).

When a server is started up and its networking components are initiated, Address Resolution Protocol is the mechanism that tells the switch in front of the server to route all traffic for its IP address to the MAC address of its network card. The OS, through ARP, tells the switch to go to NODE1 for 192.168.1.1 requests.

When Grid Infrastructure starts, one of its startup tasks is to do something similar. GI, through ARP, tells the switch to go to NODE1 for all NODE1-VIP (192.168.1.11) requests. Until GI starts the VIP, that VIP address is un-routable.

Now here’s the magic part…when NODE1 goes down, GI on another node in the cluster will detect the outage. GI will then perform a new ARP operation that informs the switch to route the VIP to another node in the cluster. Because the VIP is virtual, it can be re-routed on the fly. In the diagram below, NODE1 has failed. Its traditional IP is no longer available as well. GI has re-ARPed the VIP to the remaining node in the cluster.

vips_4

The re-ARPing of the VIP can be accomplished in seconds. The end user may experience a brief pause in their network communication between the application and the database instance, but this is much, much less than if we waited for TCP/IP timeouts.
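
You can see where a VIP is currently running with srvctl. After the failover pictured above, it would report something like this (the host and VIP names follow the diagram and are assumptions):

[root@node2]# srvctl status vip -vip node1-vip
VIP node1-vip is enabled
VIP node1-vip is running on node: node2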

Oracle 11gR2 introduced the SCAN Listeners. An Oracle RAC cluster can have at most three SCAN Listeners. Like the host and VIP names, the SCAN name is defined in DNS, but DNS will round-robin the SCAN name resolution across up to three different IP addresses.

In the diagram below, our two-node cluster now has two SCAN listeners. The end user makes a connection request to my-scan.acme.com and DNS resolves the name to either 192.168.1.21 or 192.168.1.22.

vips_5

As is shown above, those two SCAN VIPs are assigned to different nodes in the cluster. If NODE1 goes down, Grid Infrastructure will relocate both NODE1-VIP and MY-SCAN (192.168.1.21) to a surviving node in the cluster, through the same re-ARP operation we talked about earlier. The newer SCAN listeners and their VIPs are handled the same way as the old-style VIPs.
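
If you want to see how your own cluster’s SCAN is laid out, srvctl will show the SCAN VIPs and the SCAN listeners, including which node each is currently running on (just a sketch; the exact output varies by version):

[oracle@node1]$ srvctl config scan
[oracle@node1]$ srvctl status scan
[oracle@node1]$ srvctl status scan_listener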

To recap, Virtual IP addresses are used to provide quicker failover of network communications between the application and the nodes in the cluster. The OS uses Address Resolution Protocol to let the network switch know to route connections to the host. Grid Infrastructure uses the same ARP operations to let the network switch know where to route traffic for the VIP and the SCAN VIP. Should a node go down, GI will re-ARP the VIP and SCAN VIP to another node in the cluster.


Fog Computing


Unless you’ve been living under a rock, you know that everything is about the Cloud now. At least that’s the hype vendors and the media want us to believe. Earlier this week, I was clued into a new term. Just as people are working towards Cloud Computing comes the newer Fog Computing.

So what is Fog Computing? Basically, it’s cloud computing that has been pushed out to the data sources. In Cloud Computing, your devices and applications push their data into the proverbial cloud, a centralized location. The problem Fog Computing aims to address is that, with all of this data being pushed into the cloud, the centralized compute power may be inundated with data it has to process, and latency may be too great. Fog computing pushes processing back to the data source. Data is processed first, and then what is hopefully a smaller amount of data is sent up to the cloud. Fog computing is very important when talking about the Internet of Things. Instead of many, many devices sending all their raw data back to one spot to be processed, only the processed data is forwarded.

When I first read about this concept, my immediate thought was how the more things change, the more they stay the same in our cyclical nature of IT. I’m old enough that in my early days in the IT field, I worked on mainframe systems. I remember this fancy new computing model that was going to take the world by storm called “Client/Server”. Mainframes couldn’t keep up with the ever-increasing demands of the day’s applications. So instead of processing all of the data in a central location, we’ll move the processing to the client side. Scalability is easier because we handle a larger workload by deploying more clients. The Internet and web computing changed things a bit because the browser became a thin client to centralized web/application servers, but we still had scalability by deploying multiple web servers behind a load balancer to make them appear as a singular unit. Cloud computing was going to be the newest version of sliced bread. Just push your processing to the cloud and you’ll sleep much better at night. But now that some in the industry are seeing some of the pitfalls to this approach, we’re pushing processing back to the edges again.

One day, you’ll hear the media and vendors espousing the benefits of fog computing and how those very vendors are there to help you handle your transition to the new promised land. And then along will come another way to do business. It goes in cycles. Mainframe (centralized) to Client/Server (distributed) to Cloud (centralized) to Fog (distributed) to ??????

On another note, I had an editor for some database-related outlet ask me to write some articles about databases in the cloud. The editor said that “cloud is THE hot topic” as we all know. Well I can only write so much about the cloud because I don’t have a ton of experience with it. My company has our internal private cloud implementation, but to be honest, I have to take a lot of liberties with the term “cloud” to call it a private cloud. When I told the editor I wasn’t sure how much I could contribute, they told me they were not finding enough experts to talk about the cloud because  “most people I talk to haven’t moved to the cloud.”  This begs the question, is it just the media and vendors espousing the cloud?

July 2016 PSU fails to make isqora


When applying the latest PSU, I received the following errors from my “opatch apply” session:


Patching component oracle.odbc.ic, 12.1.0.2.0...
Make failed to invoke "/usr/bin/make -f ins_odbc.mk isqora 
   ORACLE_HOME=/u01/app/oracle/product/12.1.0.2"....'/usr/bin/ld: cannot find -lodbcinst
collect2: error: ld returned 1 exit status
make: *** [/u01/app/oracle/product/12.1.0.2/odbc/lib/libsqora.so.12.1] Error 1

The following make actions have failed :

Re-link fails on target "isqora".
Composite patch 23054246 successfully applied.
OPatch Session completed with warnings.
Log file location: /u01/app/oracle/product/12.1.0.2/cfgtoollogs/opatch/opatch2016-07-20_23-35-27PM_1.log
OPatch completed with warnings.


The patch was applied successfully, but the relinking did not work correctly. To fix this, I did the following:

cp $ORACLE_HOME/lib/libsqora.so.12.1 $ORACLE_HOME/odbc/lib/.
relink all

That’s all there was to it.

I did the copy and relink steps because I was trying to fix the error from OPatch. A better way to handle this is to do the copy first, then run ‘opatch apply’ and you won’t get any errors at all.
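
In other words, something like this before patching, where the patch staging directory is a placeholder for wherever you unzipped patch 23054246:

[oracle@host01]$ cp $ORACLE_HOME/lib/libsqora.so.12.1 $ORACLE_HOME/odbc/lib/.
[oracle@host01]$ cd /path/to/23054246
[oracle@host01]$ $ORACLE_HOME/OPatch/opatch apply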


I see that Bug 24332805 was posted for this issue, but I’m not privileged enough to see the contents of that bug.
