Yup.. everyone has surely tasted the 'joy' of spending Lebaran at the office, whether out of loyalty, being forced to, the generous overtime, or plenty of other reasons.
This time, after performing the Eid prayer and kneeling (that sounds a bit over the top) before my parents, I headed straight to the office.
Here's the problem summary.
First, we had an issue with undo_retention (undo_retention before = 129600).
Yes, the undo retention was set improperly, which kept undo for queries around for a long, long time. This was fixed with alter system set undo_retention=14000, and then alter system set query_rewrite_enabled=false scope=both sid='*'.
Second, the clients connecting to the database were getting an error message (ORA-03113: end-of-file on communication channel). It was caused by the connection to the server breaking; see the next (third) problem.
Third, the database was slow to respond and, you know what, as if on schedule, the slowness started at 6 am and lasted until 10 am, probably because of I/O contention and delayed commits on the database and the application.
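For the first issue, a quick check like the one below can show whether undo_retention is way above what the longest query really needs. This is only a sketch, assuming automatic undo management is in use and that you can read the V$ views; the numbers that come back are for your own system, not the ones from this incident.

```sql
-- Current setting, in seconds (129600 s = 36 hours; 14000 s is just under 4 hours)
SELECT value
FROM   v$parameter
WHERE  name = 'undo_retention';

-- Undo blocks used and longest query (in seconds) per 10-minute interval
SELECT begin_time, end_time, undoblks, maxquerylen
FROM   v$undostat
ORDER  BY begin_time;
```

If maxquerylen stays far below the retention value, the retention is probably bigger than it needs to be.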
After discussing with the application team and running Statspack, we learned that we had very, very high traffic (probably because of the Lebaran days). The temporary solution was to stop the backup script, then rename and truncate the table (the table had a B*tree index, hu..uh).
For a while the response from application to database came back to normal and the database was back to good performance. Solved? Not quite.
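For a slowdown that shows up on a schedule like this, one way to use Statspack is to take a snapshot on each side of the slow window and report on the difference. A rough sketch in SQL*Plus, assuming Statspack is already installed and you are connected as the PERFSTAT user; the times in the comments are only examples.

```sql
-- Snapshot before the slow window (for example around 06:00)
EXECUTE statspack.snap;

-- ... let the busy period run ...

-- Snapshot after the slow window (for example around 10:00)
EXECUTE statspack.snap;

-- Build the report; the script prompts for the begin and end snapshot IDs
@?/rdbms/admin/spreport.sql
```

The top wait events section of the report is where I/O contention or commit-related waits would be expected to show up.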
Today I had the same problem. Confused? Of course..
I thought maybe memory was the limit, because of (again) the high traffic being handled by only one node (the first design was RAC). So I changed (again) the PGA parameter, because the PGA ratio was 54%. For an optimum ratio I changed it to 1.5G (backup first: create pfile ='/app/oracle/product/9.2.0/dbs/init_141007.ora' from spfile) with alter system set pga_aggregate_target=1500M scope=both sid='*', and then restarted.
It had no impact, hu..uh..
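Before resizing the PGA it may help to look at what the instance itself reports. The sketch below assumes the "PGA ratio" above is the cache hit percentage from v$pgastat (that is my assumption, the number could have come from somewhere else) and that the advice view is populated, which needs statistics_level at TYPICAL or higher.

```sql
-- Current PGA target, allocation, and hit ratio
SELECT name, value
FROM   v$pgastat
WHERE  name IN ('aggregate PGA target parameter',
                'total PGA allocated',
                'cache hit percentage',
                'over allocation count');

-- Estimated hit ratio for other target sizes
SELECT ROUND(pga_target_for_estimate / 1024 / 1024) AS target_mb,
       estd_pga_cache_hit_percentage                AS est_hit_pct,
       estd_overalloc_count
FROM   v$pga_target_advice
ORDER  BY pga_target_for_estimate;
```

If the estimated hit percentage barely moves as target_mb grows, more PGA is unlikely to be the fix.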
I looked at the log buffer parameter, and then I saw that log_checkpoint_interval was set improperly (1410065407). This delays checkpoints, so for faster checkpoints with no delay
I changed it to the default "0" with alter system set log_checkpoint_interval=0 scope=both sid='*'
and you know what, the response queue from application to database was back to normal.
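To see how checkpointing is configured and whether checkpoints are keeping up, something like the following can be run before and after the change. It is a minimal sketch that only reads standard views; how to interpret the gap between the two counters is a rule of thumb, not a hard limit.

```sql
-- Parameters that drive checkpoint frequency
SELECT name, value
FROM   v$parameter
WHERE  name IN ('log_checkpoint_interval',
                'log_checkpoint_timeout',
                'fast_start_mttr_target');

-- Checkpoints started versus completed since instance startup
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('background checkpoints started',
                'background checkpoints completed');
```

A gap between started and completed that keeps growing suggests checkpoints are not finishing in time, which is worth cross-checking against the alert log.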
Now I must distribute/split the busy datafiles to other mount points and change from RAID 5 to RAID 1+0… Phew.
Who designed this project.. hu..uh.
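To pick which datafiles are the busy ones worth moving, the cumulative I/O counters per file are a reasonable starting point. A sketch only: the timing columns are meaningful only when timed_statistics is enabled, and the counters are totals since instance startup, not just for the slow window.

```sql
-- Datafiles ordered by total physical reads + writes since startup
SELECT d.name,
       f.phyrds,
       f.phywrts,
       f.readtim,    -- read time in hundredths of a second
       f.writetim    -- write time in hundredths of a second
FROM   v$filestat f,
       v$datafile d
WHERE  f.file# = d.file#
ORDER  BY f.phyrds + f.phywrts DESC;
```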