Guys, how are you?
At times, I started an ETL project that is in the conclusion phase.
Last full load process, involves the join of 5 giant tables. Then I decided to do Joins for phases.
Last stage, involves a join of two tables that I do not understand why it is taking absurdly, other phases involved tables with greater mass of data and no longer than 4 hours.
This process is only 24 hours running and has not finished.
follows data:
DBM_3Cartao - More than 200 million lines
Customer_CARD - Around 90 millions of lines
Important points:
- Join is made by Vactho2 Field that has 19 characters (this may cause lenting in Join)
- Index this created in the Customer_Card table in the CardNBR field, but performing "execution plan", still yes it appears to be running Full Scan (I do not know how to report if statistics are updated, I do not administer the bank here)
- Verified and there are no Locks
- Verified machine manager and is within expected, not performing SWAAP and compliant processing.
- I am using "Parallelism 2", I used the other form in the other processing and had an effect.
doubts
Do you judge this acceptable delay? If negative, what can be done for improvement?
NVL applied can impact this delay?
can be something related to parallelism?
repeat, previous processes had greater masses of data, but it did not take it all.
Performance Insert with SELECT
- stcoutinho
- Moderador
- Posts: 850
- Joined: Wed, 11 May 2011 5:15 pm
- Location: são Paulo - SP
Hi GPeira,
I could not give you a complete diagnosis for your case.
Just wanted to do some suggestions, more to call other forists (more experienced in perfomance) for this discussion.
Your insert makes the charge of 200 millions of lines, as you are making an outer-join between DBM_3CARTAO and CUSTOMER_CARD. Does this not take time to be processed?
If you tried the following:
1) Transform the TMP_DBM_4Client table into logging;
2) Change the structure of the target table (TMP_DBM_4Client), adding the NR_Cartao column;
3) Create an index in tmp_dbm_4client by nr_cartao;
4) Disable this index in charge (Skip Unused Indexes);
5) Insert / * + Append * / DBM_3Cartao data in TMP_DBM_4Client; (time here the time for him to do this insert without the outer join .. Comparing this time with complete processing, you have how to evaluate the when Join is increasing this time);
6) Rebuild of the index;
7) Run a Cursor + Collection, to make Update in TMP_DBM_4Cliente from the NVL columns (CA.CustomerProspectID, -2) and NVL (CA.AccountGroupNBR, -2). You would make a command forall .. Update ..;
Would you have any performance gain?
I do not think NVL is impacting on your case, but you can take a test by removing this function from your insert and evaluating if there was any significant gain.
I believe that forists may submit other suggestions;
Hugs,
Sergio Coutinho
would have any
99]]
DBM_3Cartao - More than 200 million lines
CUSTOMER_CARD - Around 90 millions of lines
I could not give you a complete diagnosis for your case.
Just wanted to do some suggestions, more to call other forists (more experienced in perfomance) for this discussion.
Your insert makes the charge of 200 millions of lines, as you are making an outer-join between DBM_3CARTAO and CUSTOMER_CARD. Does this not take time to be processed?
If you tried the following:
1) Transform the TMP_DBM_4Client table into logging;
2) Change the structure of the target table (TMP_DBM_4Client), adding the NR_Cartao column;
3) Create an index in tmp_dbm_4client by nr_cartao;
4) Disable this index in charge (Skip Unused Indexes);
5) Insert / * + Append * / DBM_3Cartao data in TMP_DBM_4Client; (time here the time for him to do this insert without the outer join .. Comparing this time with complete processing, you have how to evaluate the when Join is increasing this time);
6) Rebuild of the index;
7) Run a Cursor + Collection, to make Update in TMP_DBM_4Cliente from the NVL columns (CA.CustomerProspectID, -2) and NVL (CA.AccountGroupNBR, -2). You would make a command forall .. Update ..;
Would you have any performance gain?
I do not think NVL is impacting on your case, but you can take a test by removing this function from your insert and evaluating if there was any significant gain.
I believe that forists may submit other suggestions;
Hugs,
Sergio Coutinho
would have any
99]]
DBM_3Cartao - More than 200 million lines
CUSTOMER_CARD - Around 90 millions of lines
- fbifabio
- Moderador
- Posts: 199
- Joined: Tue, 22 Feb 2011 1:51 pm
- Location: São Paulo - SP
- Contact:
Fábio Prado
www.fabioprado.net
www.fabioprado.net
GPeira,
The stcoutinho gave great response. To complement below 3 SQL tuning tips and that I am sure that will help you optimize the performance of this query:
1- Create the DBM.TMP_DBM_4Client table as GTT (Global Temporary Table)
2- Optimize SELECT Creating for example an index in the relationship FK and / or replacing the Outer Join by 2 Union connected instructions with Inner Join
3- Insert with direct load using the Hint Append ( Search for the restrictions of this command).
Note: I only gave 3 tips, but much more things could be performed with longer dedication time by analyzing the SQL statement and the environment in which it is being executed.
The stcoutinho gave great response. To complement below 3 SQL tuning tips and that I am sure that will help you optimize the performance of this query:
1- Create the DBM.TMP_DBM_4Client table as GTT (Global Temporary Table)
2- Optimize SELECT Creating for example an index in the relationship FK and / or replacing the Outer Join by 2 Union connected instructions with Inner Join
3- Insert with direct load using the Hint Append ( Search for the restrictions of this command).
Note: I only gave 3 tips, but much more things could be performed with longer dedication time by analyzing the SQL statement and the environment in which it is being executed.
-
- Information
-
Who is online
Users browsing this forum: No registered users and 15 guests