AboutPeter Choi Expertise I am a senior Oracle DBA, PeopleSoft Administrator and Project Manager with
10+ years experience. I have been working with PeopleSoft (HRMS 5, 7, 7.x and
8.9), Oracle RDBMS (7.3 - 11gR1) on various Unix and Windows platforms, and some Oracle Application Server (9i/10gR2). I also have experience with the configuration and administration of BEA`s Tuxedo and WebLogic for PeopleSoft 8.x.
I need your input on a decision. We have a 10g db running on Linux and want to do selects of say 500 000 - a million rows (will possibly grow in the future). Many of our apps are going to do this during the day for statistic purposes. They will select the same data.
I addition to this, several processes will write the same amount of data, but to other tables.
My question is, do we need to do some sort of RAC or replication designing? Or maybe just do some partitioning of tables?
What do you think?
Thank you so much for helping!
Answer Hi Julia,
Oracle's Real Application Cluster (RAC) provides higger availability and allow for multiple nodes to be connected to single database. Your bottle neck will be at your database; more specifically, it will be the inserting -> writing to disk storage / selecting -> retrieving from disk storage.
Do you have queries that select the same data set (e.g. select data from the last three months)? If you do, this data can be cached in memory if you have enough memory to store all this data. If not, it will be popped out of the cache.
So there are two questions that we will need to address. The first being SELECTING large volume of data (I'll assume that large volume of different data will be selected as this is a worst case scenario).
I'll assume that you've tuned and optimized the query as well as building the appropriate index(es) for the query to use. Partition can help in this case (more on this later).
The second question is in writing (or inserting) data to the database. Your data is written during the day. The question is: are these tables being accessed during the day by users or other reports/queries? If no, then you can speed up the insert by dropping your index on the table(s), load your data, then rebuild your index. When you insert data, Oracle inserts the rows of data sequentially. It's slow if Oracle has to load data, build the index, balance the index, load the next data.
Another factor that can slow your database insert, is archivelogs! How? If you have archivelogmode enabled, for ever insert into the database, Oracle will generate an archive log (for undo/rollback). Unless your load process is restartable, if it fails, Oracle will need to rollback the transactions. Another thing that you can do is to turnoff archivelogmode. You can only do this if you do not need rollback or no other users are using the online system. This is best done in the off hours with no users online.
Once data has been inserted, make sure you update your statistics so that the Oracle optimizer can use this information to improve query selection.
Partitioning doesn't help with the load process. It helps with the select queries (limited which set of data that you need to retrieve). Generally, partitions are created along chronological time lines (hours, days, months, years, etc.) In a manufacturing plant, you can also partition along product lines. It all depends on your business requirements.
It should be noted that Partitioning is only available with Oracle RDBMS Enterprise Edition and is an additional option for purchase.