SQL performance tuning and DBA thread

Lendarios

Trump's Staff
<Gold Donor>
19,360
-17,424
I'll use this thread to answer questions and gather information about how to tune your SQL database. Feel free to ask questions about table design, indexes, queries, etc.


http://msdn.microsoft.com/en-us/magazine/cc135978.aspx

This link covers how to optimize your database and tables, at a novice to intermediate level.



For those who like SQL performance and tuning, here is a good presentation about database performance and indexing. This is intermediate to advanced material, but I'm sure someone can benefit from it.

http://performance.sqlpass.org/2012Palooza.aspx
 

fred sanford

<Gold Donor>
1,559
4,382
I've got a general DBA career question. Once upon a time I did server administration work which included DBA work with both Oracle and SQL Server, about 2.5 years' worth. Three years ago I started doing IT project management, and I've come to the conclusion that I don't prefer PM work and miss the technical work. I'd like to get back into DBA work if possible, although not having done it in recent years will probably get me a lot of rejections. I was wondering if I should work on both platforms or specialize in one or the other? For Oracle I would definitely need to brush up on Linux/Unix. Between the two I'm more familiar with SQL Server and its BI tools (SSIS, SSRS, SSAS) and play around with them regularly at home.

Any advice?
 

Lendarios

Trump's Staff
<Gold Donor>
19,360
-17,424
There is a good market for DBAs with strong BI knowledge: SSIS, SSAS, SSRS. If you don't have the recent work experience, try supplementing that with certifications. Also, the most important skill for a DBA is the ability to write medium to complex queries, so make sure your T-SQL skills are good.
For example, our DBA work mostly consists of automating tasks, such as backup jobs, index analysis, and query writing for reports.
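As a sketch of the kind of automation involved, a nightly full backup job (typically scheduled through SQL Server Agent) might look something like this; the database name and path here are hypothetical:

```sql
-- Hypothetical nightly full backup job step (SQL Server 2008+)
BACKUP DATABASE SalesDb
TO DISK = N'D:\Backups\SalesDb_full.bak'
WITH COMPRESSION,  -- shrink the backup file (Enterprise/Standard 2008 R2+)
     CHECKSUM,     -- verify page checksums while backing up
     INIT;         -- overwrite the previous backup set in this file
```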
 

foop_sl

shitlord
60
0
It really depends on your workplace. In some shops DBAs are only allowed to run scripts provided by developers; in others, that's all they know how to do.
Two books that I heartily recommend for anyone interested in performance tuning/SQL Server are:
SQL Server 2008 Query Performance Tuning Distilled and Microsoft® SQL Server® 2008 Internals. These two books are basically the best books on SQL Server, and the blogs of the authors are also a goldmine; add Brent Ozar's blog and you are set for extensive SQL Server knowledge.
 

Sir Funk

Lord Nagafen Raider
1,251
155
My job as a reporting analyst is quickly turning into a DBA job. I have no idea wtf I'm doing and every time I start trying to learn I get bored to tears.

Goddammit I hate having to man up.

Sorry for contributing nothing, this thread just made me want to vent
 

Deruvian

Lord Nagafen Raider
640
115
I'm in a similar situation; I love the database work, though. Good reporting and serious competency in SQL have to go hand in hand.

Here are a few questions while I'm here:
In MS Access, does using subqueries confuse the JET query optimizer?
Is there any general size threshold for deciding whether to index a table?
If I am using a cross join to scale a 100k row table to ~10m rows, would indexes improve the speed of these calculations? Would indexes on the smaller tables that are being cross joined cause an increase in speed?

Another one is something I've noticed that I'd like more clarity on. If I have a small dataset constructed with constraints on date and other fields and I inner join this with a large order table on a FK, I have noticed a large performance increase in placing where constraints which duplicate the constraints on the small table. I would have thought that the join alone would not need these extra constraints. Given the increase I have seen, is it worth fully duplicating on the large table the constraints existing in the smaller table?

Sorry if this reads as cluttered; I made this post on my phone.
 

Lendarios

Trump's Staff
<Gold Donor>
19,360
-17,424
All these answers pertain to MS SQL.

Every table should be indexed with more than one index. The only downside of indexing a table is that updates and inserts are a bit slower because more structures have to be maintained, but unless you are doing hundreds of updates/inserts per second, you will not notice any performance impact on your application.

How does an index work?

To understand how an index works, you first have to see how the data is stored in SQL Server. The data is stored in a file, by table and row. This means that all the information for one table is placed sequentially in the file, ordered by the CLUSTERED INDEX. In other words, the clustered index gives you the physical ordering of the data.
For example the table
Customer_ID , Customer_Name
1 Alex
2 Michael
3 Kevin
4 Smith
5 John
6 Peter
7 Zen

This will be on the file as 1 Alex, 2 Michael, 3 Kevin, 4 Smith, 5 John, 6 Peter, 7 Zen

If we create a NON clustered index on Customer_Name, SQL builds a tree structure over the names (picture a binary tree with Michael at the root).
Now look at the query:
Select * from Customer where Customer_Name = 'Peter'


If you only have a clustered index, SQL has to look at the whole table in order to find the records that match the where clause. This is called a table scan, and it is extremely expensive, the reason being that SQL has to traverse the whole table every time.

Look at how it works on the clustered index: SQL has to load record 1 and compare Alex with Peter, no good, next row; then Michael with Peter, no good, next row; even if it gets to row 6 and finds Peter, it still has to keep going, load record 7, and compare Zen with Peter. This is because more than one row can match your where clause. By the way, every record that gets compared is a logical read, so for this table we are doing 7 logical reads every time we want to find a record by customer name.

Now with the NON clustered index

SQL starts at the root node (Michael) and compares Michael with Peter. The comparison is negative, so it goes down the branch in the proper direction: Peter sorts to the right, so it ignores everything to the left of the node "Michael". In other words, SQL will ignore the nodes Kevin, Alex, and John. SQL loads the next node, Peter, and finds a match; not only does it find a match, it doesn't need to go further down, as duplicate records would appear at the same node level.
So how many logical reads is that? For Peter it was just 2 reads. And you can guarantee that the number of reads is dramatically smaller: it grows logarithmically with the table size. This means the table can have millions of rows while a lookup takes only a handful of reads. This search process is called an index seek, and it is what you should strive for.

In the example I used a binary tree, which only has two branches per node; SQL actually uses a B-tree structure that has more than two children per node, increasing the efficiency.

TL;DR: yes, add indexes on every table, after looking at the queries that are run against it.
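As a sketch, using the Customer table from above, you can create the nonclustered index and see the logical reads for yourself in the Messages tab:

```sql
-- Nonclustered index on the column the WHERE clause searches
CREATE NONCLUSTERED INDEX IX_Customer_Name
    ON Customer (Customer_Name);

-- Report logical reads for each statement that follows
SET STATISTICS IO ON;
SELECT * FROM Customer WHERE Customer_Name = 'Peter';
SET STATISTICS IO OFF;
```

With the index in place, the plan should show an index seek instead of a clustered index scan, and the logical read count drops accordingly.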

For the other question: are you asking from a code perspective or a SQL perspective? The word DataSet is a programming term, and a DataSet should be avoided; it is very inefficient compared to SQL data readers.
 

foop_sl

shitlord
60
0
Just two lil things: if you are using MS Access and have 10m rows, you will be in a world of trouble.

A clustered index is actually the bare minimum you need on each table; otherwise your table is a heap and won't be logically structured (how do you locate something fast in a big heap?).
What you want to avoid is any kind of scan: a clustered index scan (which is basically equal to a table scan, since the leaf nodes of a CI are your data pages), a nonclustered index scan, or a table scan.

Nonclustered index leaf nodes contain either a row locator or the clustered index key for the row, depending on whether the table is a heap or not.

The clustered index doesn't have to be put on the PK. There are a lot of pitfalls and misconceptions regarding DB design, indexing, etc.
Always use Books Online (commonly referenced as BOL) when looking for information.
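To illustrate that last point, here is a sketch (table and column names are hypothetical) of a table whose PK is nonclustered while the clustered index sits on the column that range queries actually use:

```sql
-- PK enforced by a nonclustered unique index,
-- so the clustered index is free for another column
CREATE TABLE Orders (
    Order_ID   int IDENTITY(1,1) NOT NULL PRIMARY KEY NONCLUSTERED,
    Order_Date datetime NOT NULL,
    Amount     money    NOT NULL
);

-- Cluster on the date so date-range scans read contiguous pages
CREATE CLUSTERED INDEX CIX_Orders_Date ON Orders (Order_Date);
```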
 

Deruvian

Lord Nagafen Raider
640
115
Interesting stuff on indexing. So knowing this, would an index on the small table's PK, joined to the large table's FK, cause significant speed gains?

I suppose my other question could be posed as such:

Why is the bolded line of SQL below giving me such gains in query speed? It seems like the logic is redundant, as it is already in the first piece of the subquery.

 

Lendarios

Trump's Staff
<Gold Donor>
19,360
-17,424
Index the FK column on the big table; the small table's PK is already indexed for your needs. On MS SQL the PK is implemented as a unique clustered index by default.

Also, if you are on MS SQL, use "included columns" on your nonclustered indexes.

Is that query in SQL or Access? In SQL the query optimizer will determine the most optimal execution path. Look at the query plan and post the XML plan so we can see it.
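As a sketch of what an included-columns index looks like (table and column names hypothetical): the index seeks on the FK, and the INCLUDE columns ride along in the leaf level, so the query can be answered from the index alone without key lookups back to the table:

```sql
-- Covering index: seek on the join/filter column,
-- carry the selected columns in the leaf pages
CREATE NONCLUSTERED INDEX IX_OrderDetail_OrderID
    ON OrderDetail (Order_ID)
    INCLUDE (Quantity, Unit_Price);
```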
 

foop_sl

shitlord
60
0
Look at http://msdn.microsoft.com/en-us/library/ms189499.aspx for the SELECT order ("Logical Processing Order of the SELECT statement").

Basically you are joining two different "datasets" before applying the join condition, and by adding an additional filter you get less data to join. That's an assumption, though, and in databases, assumptions are just that.
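A sketch of the pattern being discussed (all table and column names hypothetical): duplicating the small table's date filter directly on the large table can shrink the rows that even reach the join:

```sql
SELECT o.Order_ID, o.Amount
FROM BigOrders o
INNER JOIN SmallDates d
    ON o.Date_Key = d.Date_Key       -- d is already filtered to January
WHERE o.Order_Date >= '2013-01-01'   -- "redundant" filter repeated on the
  AND o.Order_Date <  '2013-02-01';  -- big table reduces the join input
```

A good optimizer may push the predicate down on its own, but a weaker one (Access/JET in particular) often benefits from being told explicitly.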

Why are you using Access in the first place? It's not really a database tool.
 

Deruvian

Lord Nagafen Raider
640
115
I work as an analyst in my company's finance department. I think that my managers are afraid of any products that are not in the Microsoft Office suite; Excel functions are about the limit of their IT prowess. Although there are a number of deficiencies in the SQL functionality that I have access to, I can make up some ground within VBA. Further, Access has great interoperability with Excel, which 95% of my data needs to be translated to for my managers to make use of it. Also, nearly all of the database tables that I query from are in SQL Server and are accessed through ODBC.

Honestly, I couldn't have even told you what SQL was or what it was used for 12 months ago. I've really enjoyed the process of learning this stuff and am kicking myself for not going into IT from the outset.
 

Sir Funk

Lord Nagafen Raider
1,251
155
I work as an analyst in my company's finance department. I think that my managers are afraid of any products that are not in the Microsoft Office suite; Excel functions are about the limit of their IT prowess. Although there are a number of deficiencies in the SQL functionality that I have access to, I can make up some ground within VBA. Further, Access has great interoperability with Excel, which 95% of my data needs to be translated to for my managers to make use of it. Also, nearly all of the database tables that I query from are in SQL Server and are accessed through ODBC.

Honestly, I couldn't have even told you what SQL was or what it was used for 12 months ago. I've really enjoyed the process of learning this stuff and am kicking myself for not going into IT from the outset.
This is the exact same setup at my company, and it's just dumb, and it's the main reason why I won't work for a giant multinational Fortune 50 company ever again. Too large means too slow, and the benefits aren't even good enough to make up for it.

Everything has to be Microsoft; unauthorized programs are simply not allowed, and installing them is met with disciplinary action.

But yeah, everything I do is in Access or one-offs in SQL Server Management Studio. Makes life "interesting"
 

lofwyrx_sl

shitlord
7
0
Are you familiar with the Hold'em Manager SQL database? I have a really huge database and it's getting very slow; is there any way to make it faster?
 

Lendarios

Trump's Staff
<Gold Donor>
19,360
-17,424
Plenty of ways. A database runs slower because of bad queries and lack of indexing. That is the main reason; the second reason is bad data schema design.

Open SQL Profiler, open the tuning template, and run a profile during the times you feel it is slow. Make sure you include reads and writes; then, once you have gathered the information, you can see which queries are destroying your system.

Size has little to do with slowness. It is either server resources, using the database to perform business logic via SQL functions, or bad indexing.

Avoid SQL functions, cursors, and triggers, in that order.
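If running Profiler feels heavy, a rough alternative sketch is to ask the plan cache which statements do the most reads (the DMVs are standard; the exact columns picked here are just one reasonable choice):

```sql
-- Top 10 cached statements by total logical reads
SELECT TOP 10
    qs.total_logical_reads,
    qs.execution_count,
    SUBSTRING(st.text, qs.statement_start_offset / 2 + 1, 200) AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;
```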
 

Etruscus_sl

shitlord
1
0
Are you familiar with the Hold'em Manager SQL database? I have a really huge database and it's getting very slow; is there any way to make it faster?
Fairly certain Hold'em Manager uses Postgres.

In this case, a few things are probably happening to you:

1. You are using the stock configs.
2. You might not be running VACUUM / ANALYZE often enough.
3. You don't have enough RAM to hold the indexes in memory, causing the optimizer to choose bad query plans or to do on-disk seeks within the indexes.
4. Your disk speed is slow, and since your DB is large, seeks are longer.

If you don't know what I'm talking about with stock configs, then you probably are running stock configs.

Provided that your DB runs on a server by itself, you might want to consider settings that look something like this:

work_mem = total memory available / max number of connections. You might not want to mess with this one.
shared_buffers = 25% of available memory, but not greater than 4GB
effective_cache_size = 50% of available memory
maintenance_work_mem = 256MB is a good working value, unless you have very little total RAM (say 1GB), then scale down appropriately.
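Concretely, on a hypothetical dedicated box with 16GB of RAM, those guidelines might translate into a postgresql.conf fragment like this (values are illustrative, not tuned for any real workload):

```
# postgresql.conf - illustrative values for a dedicated 16GB server
shared_buffers = 4GB              # 25% of RAM, capped at 4GB
effective_cache_size = 8GB        # ~50% of RAM; planner hint, not an allocation
maintenance_work_mem = 256MB      # used by VACUUM, CREATE INDEX, etc.
# work_mem: roughly RAM / max_connections; often best left at the default
```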