Showing posts with label MSSQL. Show all posts
Showing posts with label MSSQL. Show all posts

Friday, July 25, 2014

Table Index – Part 1, Choosing the right column

Simple to say that Indexing is essential for filtering/sorting/grouping a table having more number of rows. But the real challenge is choosing the right column for index.
Before that, Let’s have a quick understand of Index  & how it is helpful in filtering the rows:
  • Index can be classified as Clustered Index & Non-Clustered/Secondary Index.
  • A Table will have only one Clustered Index, but can have multiple Secondary Index.
  • In MySQL, Primary Key is considered as a Clustered Index.
  • Rows are sorted & stored in the table based on this Clustered Index. The row values are stored closed to the Clustered Index column.
  • A Secondary Index will be maintained in a separate dictionary with that Indexed column (sorted) & the primary key column (to link the clustered index). If Primary Key index is long, secondary index use more space.
  • A example of Secondary Index :
    ix_img1
  • When we filter rows based on indexed column, instead of scanning all rows, only the needed rows will be scanned. (as the indexed values will be sorted)ix_img2
As it is clear there is a cost for creating index, we cannot simply create index for all the columns we use in WHERE clause.  Index affects the performance of Write Queries (Insert / Update / Delete) and it also consumes space in database.

 So selecting the right column is important.
In General, A column having more distinct values is good for Index.

 Rule of thumb :
Index Selectivity Formulae = (Count of distinct values/Number of Rows) * 100%
If Selectivity is > 8 % it can be considered for Index
In other simple words, when you filter a query by using an indexed column, you should get less than 15% of rows.

For example, if you need to filter an 10,000 records table based on Gender (the 2 distinct values will be Male / Female)
Selectivity = 2 / 10000 * 100% = 0.002% and so it should not be used for index.
ix_img3

As we can see cursor needs to scan more rows in the Index table & then map with the prime table. This has no advantage than scanning entire rows of the Primary table.
Also this selectivity formulae cannot be followed simply for all. We should consider the volume of data for each distinct value.

 For example, an Indian Ecommerce store having 500 customers from more than 50 countries. Based on our rule, it is eligible. But when you analyze the data, if 400 customers are from India alone, then there is no need for Index.

 So with selectivity formulae, you have to judge based on the possible volume of data.
In Part 2, We can see how to pick the columns for Multiple Column Index.

Friday, January 7, 2011

MSSQL & Event ID: 17137

In Event Viewer > Application, If you have noticed Thousands of Events with details "Starting up database.." Event ID: 17137, here is the solution :

Make sure to set all the database in your SQL Server as Auto_Close=False. By default in SQL Server Express Editions, Auto_Close was set to True. If Auto_Close is true, SQL Server close the DB when the last connection is closed. So the databases were stopped & started regularly.

To Set Auto_Close = False
In SQL Server Manager > Right Click Database > Properties > Options, you can find the option.
OR
ALTER DATABASE SET AUTO_CLOSE OFF