
AWR reports in Oracle databases

Oracle AWR (Automatic Workload Repository) reports are a diagnostic tool available in Oracle databases that helps analyze system performance and identify areas requiring optimization. At first glance they look, let’s be honest, quite complicated: lots of sections, lots of numbers and parameters. Because of their detailed and technical nature, they take some experience to use. After a closer look, however, they turn out to be an extremely valuable tool for database administrators, and the fact that they give access to archived database statistics (by default covering the last 8 days) makes them even more useful.

Maintaining a minimum eight-day retention period is a well-thought-out default, because it gives insight into the same period a week earlier. In other words, if you experienced a performance issue today, it may be worth checking whether a similar event occurred a week ago. Perhaps the problem is cyclical and related, for example, to automatically scheduled jobs. If you feel that this period is still too short, increase it using the following procedure, remembering to reserve more space in the SYSAUX tablespace:

exec dbms_workload_repository.modify_snapshot_settings(interval=>60,retention=>30*24*60);

In this example, the retention period is set to 30 days (43,200 minutes) and the interval between snapshots to 60 minutes.
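
To verify the current settings afterwards, you can query the workload repository control view (a quick check, assuming access to the DBA_HIST views):

-- Current snapshot interval and retention; both columns are day-to-second intervals
SELECT snap_interval, retention FROM dba_hist_wr_control;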

Since Active Session History views are used to create the report, a Diagnostics Pack license is required. If you don’t have one, you can use the Statspack package, which unfortunately does not offer the same level of automation as AWR reports.
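
Whether these pack features are enabled at all is governed by the control_management_pack_access initialization parameter; a quick way to check it from SQL*Plus:

SQL> show parameter control_management_pack_access

On Enterprise Edition the default value is DIAGNOSTIC+TUNING; if it has been set to NONE, the pack features, including AWR, are disabled.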

Oracle AWR reports

Generating a report

To generate the report, run the awrrpt.sql script located in the $ORACLE_HOME/rdbms/admin directory:

cd $ORACLE_HOME/rdbms/admin

sqlplus / as sysdba

SQL> @awrrpt.sql

After running it, choose whether you want the result as plain text or HTML, and specify the range of snapshots you want to analyze (the time interval).

Enter value for num_days: 1
Listing the last day's Completed Snapshots
Instance     DB Name      Snap Id       Snap Started    Snap Level
------------ ------------ ---------- ------------------ ----------

cdblab01    CDBDBLAB01        18330  20 Sep 2023 00:00    1
                              18331  20 Sep 2023 01:00    1
                              18332  20 Sep 2023 02:00    1
                              18333  20 Sep 2023 03:00    1
                              18334  20 Sep 2023 04:00    1
                              18335  20 Sep 2023 05:00    1
                              18336  20 Sep 2023 06:00    1
                              18337  20 Sep 2023 07:00    1
                              [...]
                              


Specify the Begin and End Snapshot Ids
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter value for begin_snap: 18336
Begin Snapshot Id specified: 18336

Enter value for end_snap: 18337
End   Snapshot Id specified: 18337
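
If you would rather script the report than answer the interactive prompts, the same HTML report can be pulled directly from DBMS_WORKLOAD_REPOSITORY. A minimal sketch, using the DBID, instance number and snapshot IDs from this example:

-- Non-interactive generation of the same report (values taken from this example;
-- substitute your own DBID, instance number and snapshot range)
SET PAGESIZE 0 LINESIZE 1500 TRIMSPOOL ON
SPOOL awr_18336_18337.html
SELECT output
  FROM TABLE(dbms_workload_repository.awr_report_html(3446739, 1, 18336, 18337));
SPOOL OFF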

Once the data has been collected, the report is displayed on the screen. The first section contains information about the database name, the host and, optionally, the PDB containers in the database.

DB Name         DB Id    Unique Name DB Role          Edition Release    RAC CDB
------------ ----------- ----------- ---------------- ------- ---------- --- ---
CDBDBLAB01     3446739   cdblab01    PRIMARY          EE      19.0.0.0.0  NO YES

Instance     Inst Num    Startup Time       User Name    System Data Visible
------------ --------    ---------------    ------------ --------------------
cdblab01            1    24-Jun-23 18:04    SYS          YES

Container DB Id  Container Name       Open Time
--------------- --------------- ---------------
       79466073   PDB$SEED      24-Jun-23 18:04
      544672419   CDB$ROOT      24-Jun-23 18:04
     3580336812   PDBLABDB      24-Jun-23 18:04

Host Name        Platform                         CPUs Cores Sockets Memory(GB)
---------------- -------------------------------- ---- ----- ------- ----------
Aixlab100       AIX-Based Systems (64-bit)         64     8             216.00

The next section shows the times at which the analyzed snapshots were taken, the number of sessions and the average number of open cursors per session:

Snap Id      Snap Time                    Sessions Curs/Sess  PDBs
            --------- ------------------- -------- --------- -----
Begin Snap:     18336 20-Sep-23 06:00:45       361       5.4     3
  End Snap:     18337 20-Sep-23 07:00:51       330       5.2     3
   Elapsed:               60.10 (mins)
   DB Time:            1,106.98 (mins)

At this point, pay attention to the value of the DB Time metric. It is the total time spent by all foreground (user) sessions in database calls: the sum of their CPU time and non-idle wait time.

Even at this stage we can draw some conclusions. By comparing the elapsed time between snapshots (Elapsed) with the DB Time value describing database activity, we can see how busy the instance was: if DB Time is much larger than the elapsed time, many sessions were active at once, either working on CPU or waiting for resources to be released.
In this case, even though we are analyzing a 60-minute interval, DB Time exceeds 1,100 minutes; dividing the two values (1,106.98 / 60.10) gives roughly 18 average active sessions, which matches the per-second DB Time figure shown later in the Load Profile. This may mean that the analyzed database was busy, or even overloaded, at that time.

Instance Efficiency Percentages

Another section that gives an overall idea of performance is Instance Efficiency Percentages. It contains the so-called hit ratios, as well as the ratio of statements executed to statements parsed (Execute to Parse %). As the heading suggests (Target 100%), the ideal value for each of these metrics is 100%.

This section shows that almost 99% of the parsing is soft, as indicated by the Soft Parse % metric. This is a very good result: it means the Oracle database is reusing previously prepared SQL execution plans. As a general rule, you should always minimize the number of hard parses in a production database, because doing so reduces the CPU load caused by costly parsing.

The next interesting metric is Parse CPU to Parse Elapsd % (the ratio of parse CPU time to total parse elapsed time). In this case it is about 22%, which shows that the database was waiting for some resources while parsing SQL statements. From experience, this metric often takes low values, and in this case it does not indicate any significant irregularity.

Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:   99.99       Redo NoWait %:  100.00
            Buffer  Hit   %:   97.67    In-memory Sort %:  100.00
            Library Hit   %:   99.73        Soft Parse %:   98.76
         Execute to Parse %:   91.18         Latch Hit %:   99.87
Parse CPU to Parse Elapsd %:   22.18     % Non-Parse CPU:   99.62
          Flash Cache Hit %:    0.00
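
If you want to keep an eye on the soft-parse ratio between AWR snapshots, the underlying counters are also available live in V$SYSSTAT. A rough sketch (the values are cumulative since instance startup, whereas the report shows the delta between two snapshots):

-- Approximate Soft Parse % computed from cumulative instance statistics
SELECT ROUND(100 * (1 - hard.value / NULLIF(total.value, 0)), 2) AS soft_parse_pct
  FROM v$sysstat total, v$sysstat hard
 WHERE total.name = 'parse count (total)'
   AND hard.name  = 'parse count (hard)';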

Let’s move on to the next section of the report, Top 10 Foreground Events by Total Wait Time. It shows the wait events on which the database spent the most time.

The example below illustrates that the most time was spent on single-block reads, the db file sequential read event, which typically accompanies index lookups. In a loaded system handling a large number of transactions (such as this one), a high value for this event is normal; however, large values for, say, SQL*Net message from dblink would indicate that the database was fetching data from remote databases over database links and most likely ran into a network or remote-side performance problem.

Top 10 Foreground Events by Total Wait Time
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                           Total Wait       Avg   % DB Wait
Event                                Waits Time (sec)      Wait   time Class
------------------------------ ----------- ---------- --------- ------ --------
DB CPU                                          11.4K             17.2
db file sequential read         10,769,892     6999.4  649.90us   10.5 User I/O
direct path read                 1,126,591     5149.2    4.57ms    7.8 User I/O
db file scattered read           1,148,518       3498    3.05ms    5.3 User I/O
direct path read temp            2,463,087     2757.3    1.12ms    4.2 User I/O
SQL*Net message from dblink        494,308     2334.2    4.72ms    3.5 Network
SQL*Net more data from dblink    2,023,625     2037.2    1.01ms    3.1 Network
direct path write temp             930,859     1428.7    1.53ms    2.2 User I/O
latch free                       1,974,192     1083.2  548.67us    1.6 Other
control file sequential read     1,753,380        707  403.20us    1.1 System I

The load profile (Load Profile)

This section provides detailed data on the types of operations the database performed and how many resources they consumed. What matters most here is the rate of I/O operations (Physical reads / writes) and of hard parses. If you notice large values for hard parses, it may mean that your queries are not using bind variables; if you see this on a production system, report it to the project team.

A broader explanation is needed here: what are bind variables?
The example below illustrates this. Suppose we run these four queries:

SELECT * FROM EMPLOYEES WHERE EMPLOYEE_ID=10;
SELECT * FROM EMPLOYEES WHERE EMPLOYEE_ID=15;
SELECT * FROM EMPLOYEES WHERE EMPLOYEE_ID=19;
SELECT * FROM EMPLOYEES WHERE EMPLOYEE_ID=20;

The database will prepare a separate execution plan for each of these queries. This is not optimal from a performance point of view, so to protect against it we should use the previously mentioned bind variables.

Let’s create a variable like this:

variable x number;

and let’s assign a value to it:

begin
 :x := 10;
end;
/

Now we can refer to this variable in queries using the syntax:

SELECT * FROM EMPLOYEES WHERE EMPLOYEE_ID=:X;

We can run the above query many times, each time supplying a different value for the variable; from the optimizer’s point of view it is still the same statement. Put simply, the cost-based optimizer will use the same execution plan regardless of the value of the bind variable. When we delve deeper into the topic, it turns out that a set of several plans may be prepared, but I described this mechanism in more detail in a separate article: Adaptive Cursor Sharing.
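
A quick way to confirm that such a query really is being shared is to look it up in V$SQL after a few executions with different bind values. A minimal sketch, assuming the EMPLOYEES example above was actually run:

-- A single cursor (one SQL_ID) should be reused for every execution,
-- regardless of the bind value supplied
SELECT sql_id, executions, parse_calls, loads
  FROM v$sql
 WHERE sql_text LIKE 'SELECT * FROM EMPLOYEES WHERE EMPLOYEE_ID=:%';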

Returning to the analysis of Oracle AWR reports: if, while reviewing the Load Profile section, you find a high rate of User Logons, it may mean that the application opens a new connection for every operation, i.e. connects, executes a query and disconnects. With many clients doing this simultaneously, this can noticeably increase the load on the database system.

Load Profile                    Per Second   Per Transaction  Per Exec  Per Call
~~~~~~~~~~~~~~~            ---------------   --------------- --------- ---------
             DB Time(s):              18.4               0.9      0.01      0.03
              DB CPU(s):               3.2               0.2      0.00      0.01
      Background CPU(s):               0.1               0.0      0.00      0.00
      Redo size (bytes):       9,658,868.4         445,126.1
  Logical read (blocks):         971,544.6          44,773.4
          Block changes:           1,680.4              77.4
 Physical read (blocks):         178,511.2           8,226.6
Physical write (blocks):          14,240.7             656.3
       Read IO requests:           7,355.6             339.0
      Write IO requests:           1,348.7              62.2
           Read IO (MB):           2,789.2             128.5
          Write IO (MB):             222.5              10.3
           IM scan rows:               0.0               0.0
Session Logical Read IM:               0.0               0.0
             User calls:             594.7              27.4
           Parses (SQL):             178.8               8.2
      Hard parses (SQL):               2.2               0.1
     SQL Work Area (MB):              58.9               2.7
                 Logons:               7.7               0.4
            User logons:               7.1               0.3
         Executes (SQL):           2,026.5              93.4
              Rollbacks:               8.4               0.4
           Transactions:              21.7
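
If you suspect this kind of connection churn, the logon counters can also be checked live, outside the report. A simple sketch (values are cumulative since instance startup):

-- Total logons since startup vs. sessions currently logged on
SELECT name, value
  FROM v$sysstat
 WHERE name IN ('logons cumulative', 'logons current');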

Let’s return to the Top 10 Foreground Events by Total Wait Time section shown earlier, which lists the events responsible for the longest waits.

This part of the report points to the potential sources of database problems; the first two entries are usually responsible for most of the performance impact.

In a properly functioning database, events related to the CPU and disks should appear at the top of the list, as in our case. If, however, events such as enq: TX - row lock contention or latch free appear there, they usually indicate contention for resources.
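
When one wait event clearly dominates, it is often worth checking which statements were actually waiting on it. A rough sketch based on the ASH data kept in AWR (Diagnostics Pack required; the time range below approximates the interval analyzed in this report, and each stored sample represents roughly ten seconds of activity):

-- Statements most often caught waiting on the dominant event during the interval
SELECT sql_id, COUNT(*) AS ash_samples
  FROM dba_hist_active_sess_history
 WHERE event = 'db file sequential read'
   AND sample_time BETWEEN TIMESTAMP '2023-09-20 06:00:00'
                       AND TIMESTAMP '2023-09-20 07:00:00'
 GROUP BY sql_id
 ORDER BY ash_samples DESC
 FETCH FIRST 10 ROWS ONLY;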

Time Model Statistics

Time model statistics provide insight into where processing time is actually spent in the snapshot interval. If the hard parse elapsed time is very long, you need to investigate this issue further.

Time Model Statistics         DB/Inst: CDBDBLAB01/cdblab01  Snaps: 18336-18337
-> DB Time represents total time in user calls
-> DB CPU represents CPU time of foreground processes
-> Total CPU Time represents foreground and background processes
-> Statistics including the word "background" measure background process 
   time, therefore do not contribute to the DB time statistic
-> Ordered by % of DB time in descending order, followed by Statistic Name

                                                                % of  % of Total
Statistic Name                                       Time (s) DB Time   CPU Time
------------------------------------------ ------------------ ------- ----------
sql execute elapsed time                             65,734.7    99.0
DB CPU                                               11,415.6    17.2       97.6
PL/SQL execution elapsed time                         1,165.2     1.8
inbound PL/SQL rpc elapsed time                         482.5      .7
connection management call elapsed time                 372.9      .6
parse time elapsed                                      192.3      .3
hard parse elapsed time                                 121.9      .2
RMAN cpu time (backup/restore)                           33.6      .1         .3
hard parse (sharing criteria) elapsed time               11.9      .0
PL/SQL compilation elapsed time                           1.9      .0
repeated bind elapsed time                                1.8      .0
sequence load elapsed time                                0.8      .0
failed parse elapsed time                                 0.5      .0
hard parse (bind mismatch) elapsed time                   0.4      .0
DB time                                              66,418.6
background elapsed time                               2,429.7
background cpu time                                     280.0                2.4
total CPU time                                       11,695.6
                          ------------------------------------------------------
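
The same time-model figures can also be pulled directly from the workload repository, which is handy when comparing many intervals at once. A minimal sketch for the snapshot range used in this report (values in DBA_HIST_SYS_TIME_MODEL are cumulative microseconds, so the delta between the begin and end snapshots is taken):

-- Time model statistics for the interval between snapshots 18336 and 18337
SELECT e.stat_name,
       ROUND((e.value - b.value) / 1e6, 1) AS seconds_in_interval
  FROM dba_hist_sys_time_model b
  JOIN dba_hist_sys_time_model e
    ON  e.stat_id = b.stat_id
    AND e.dbid = b.dbid
    AND e.instance_number = b.instance_number
    AND e.con_dbid = b.con_dbid
 WHERE b.snap_id = 18336
   AND e.snap_id = 18337
   AND e.stat_name IN ('DB time', 'DB CPU', 'sql execute elapsed time')
 ORDER BY seconds_in_interval DESC;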

Longest Executing SQL Queries (Top SQL Statements)
This section of the report lists the longest-running SQL statements. Here you can spot specific problematic queries and use their SQL_ID to analyze their execution plans.

Top SQL with Top Events       DB/Inst: CDBDBLAB01/cdblab01  Snaps: 18336-18337
-> Top SQL statements by DB Time along with the top events by DB Time for 
   those SQLs.
-> % Activity is the percentage of DB Time due to the SQL.
-> % Event is the percentage of DB Time due to the event that the SQL is 
   waiting on.
-> % Row Source is the percentage of DB Time due to the row source for the 
   SQL waiting on the event.
-> Executions is the number of executions of the SQL that were sampled in 
   ASH.

                 SQL ID            Plan Hash           Executions     % Activity
----------------------- -------------------- -------------------- --------------
                                                                           % Row
Event                          % Event Top Row Source                     Source
------------------------------ ------- --------------------------------- -------
                              Container Name
 -------------------------------------------
          22drvv11nb5r6           5125497296                    1           4.95
CPU + Wait for CPU                4.93 HASH - GROUP BY                      2.66
 select t.numer nr ,a.numer ponu ,case when o.punu i
 then case when b.NUMER_KL in (select * from d

                                    PDBLABDB

          2685axh6avbvc            758157989                    1           4.95
CPU + Wait for CPU                3.73 HASH - GROUP BY                      1.33
create table TBL_INMP TABLESPACE USERS_A as select
 source, id, od_d, dt_zg, paczka, 

This article does not exhaust the full possibilities offered by the analysis of Oracle AWR reports. However, it can serve as an introduction to tracking down problem areas in the operation of database systems.

Although AWR reports may seem confusing at first, they are an indispensable tool for database administrators, helping them keep their environment in top condition and minimize the risk of performance problems.