Showing posts with label case. Show all posts
Showing posts with label case. Show all posts

Monday, March 26, 2012

Generic question about leaf-level of non-clustered indexes

In the case of a heap, I understand the leaf-level contains some sort
of RowId which points directly to the right page in the table. Good,
pretty quick.
If the table contains a clustered index, I understand this is what's
used as bookmark in all the non-clustered indexes of that table.
Question:
Couldn't this be extremely inefficient if the clustered index is not
unique?
If I created a non unique clustered index on a (small) column with low
selectivity, I might end up with thousands of rows for each value in
the clustered index.
Then a bookmark on such a value does not seem too good, does it?
I suppose there would be several pages read before reaching the actual
data page...
Thanks
EricThis is a multi-part message in MIME format.
--=_NextPart_000_00F4_01C3777E.0F0200C0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Under the covers, it builds a "uniqueifier", so that all "keys" in the =clustered index are unique, even if you created it on a non-unique =column.
-- Tom
---
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Columnist, SQL Server Professional
Toronto, ON Canada
www.pinnaclepublishing.com/sql
"Eric Mamet" <eric_mamet_test@.yahoo.co.uk> wrote in message =news:11269dcb.0309100520.6d6a4e1f@.posting.google.com...
In the case of a heap, I understand the leaf-level contains some sort
of RowId which points directly to the right page in the table. Good,
pretty quick.
If the table contains a clustered index, I understand this is what's
used as bookmark in all the non-clustered indexes of that table.
Question: Couldn't this be extremely inefficient if the clustered index is not
unique?
If I created a non unique clustered index on a (small) column with low
selectivity, I might end up with thousands of rows for each value in
the clustered index.
Then a bookmark on such a value does not seem too good, does it?
I suppose there would be several pages read before reaching the actual
data page...
Thanks
Eric
--=_NextPart_000_00F4_01C3777E.0F0200C0
Content-Type: text/html;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
&

Under the covers, it builds a ="uniqueifier", so that all "keys" in the clustered index are unique, even if you created =it on a non-unique column.
-- Tom
---T=homas A. Moreau, BSc, PhD, MCSE, MCDBASQL Server MVPColumnist, SQL =Server ProfessionalToronto, ON Canadahttp://www.pinnaclepublishing.com/sql">www.pinnaclepublishing.com=/sql
"Eric Mamet" wrote in message news:11269d=cb.0309100520.6d6a4e1f@.posting.google.com...In the case of a heap, I understand the leaf-level contains some sortof =RowId which points directly to the right page in the table. Good,pretty quick.If the table contains a clustered index, I understand this =is what'sused as bookmark in all the non-clustered indexes of that table.Question: Couldn't this be extremely inefficient if =the clustered index is notunique?If I created a non unique clustered =index on a (small) column with lowselectivity, I might end up with =thousands of rows for each value inthe clustered index.Then a bookmark on =such a value does not seem too good, does it?I suppose there would be =several pages read before reaching the actualdata page...ThanksEric

--=_NextPart_000_00F4_01C3777E.0F0200C0--|||This is a multi-part message in MIME format.
--=_NextPart_000_0037_01C3782F.68E72280
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
To add to Tom's comments, From memory this will add four bytes to the =size of your field for the subsequent non unique values. Plus an =addition two bytes because you now have a non fixed length column. As =you can see the extra overhead can be huge on something like an INT. If =there is another column you can add to the clustered index that can make =it unique, then I would consider this as it may actually use less space. = I.e. two INTs unique may use less space that one INT with low =selectivity.
-- Barry McAuslin
Crucis Technical Solutions Limited
support@.sqlfe.com
http://www.sqlfe.com
"Tom Moreau" <tom@.dont.spam.me.cips.ca> wrote in message =news:exfAK%235dDHA.568@.TK2MSFTNGP11.phx.gbl...
Under the covers, it builds a "uniqueifier", so that all "keys" in the =clustered index are unique, even if you created it on a non-unique =column.
-- Tom
---
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Columnist, SQL Server Professional
Toronto, ON Canada
www.pinnaclepublishing.com/sql
"Eric Mamet" <eric_mamet_test@.yahoo.co.uk> wrote in message =news:11269dcb.0309100520.6d6a4e1f@.posting.google.com...
In the case of a heap, I understand the leaf-level contains some sort
of RowId which points directly to the right page in the table. Good,
pretty quick.
If the table contains a clustered index, I understand this is what's
used as bookmark in all the non-clustered indexes of that table.
Question: Couldn't this be extremely inefficient if the clustered index is not
unique?
If I created a non unique clustered index on a (small) column with low
selectivity, I might end up with thousands of rows for each value in
the clustered index.
Then a bookmark on such a value does not seem too good, does it?
I suppose there would be several pages read before reaching the actual
data page...
Thanks
Eric
--=_NextPart_000_0037_01C3782F.68E72280
Content-Type: text/html;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
&

To add to Tom's comments, From memory =this will add four bytes to the size of your field for the subsequent non unique =values. Plus an addition two bytes because you now have a non fixed length =column. As you can see the extra overhead can be huge on something like an =INT. If there is another column you can add to the clustered index that can =make it unique, then I would consider this as it may actually use less =space. I.e. two INTs unique may use less space that one INT with low selectivity.
-- Barry McAuslin
Crucis Technical Solutions Limited
support@.sqlfe.comhttp://www.sqlfe.com">http://www.sqlfe.com
"Tom Moreau" = wrote in message news:exfAK%235dDHA.5=68@.TK2MSFTNGP11.phx.gbl...
Under the covers, it builds a ="uniqueifier", so that all "keys" in the clustered index are unique, even if you created =it on a non-unique column.
-- Tom

=---T=homas A. Moreau, BSc, PhD, MCSE, MCDBASQL Server MVPColumnist, SQL =Server ProfessionalToronto, ON Canadahttp://www.pinnaclepublishing.com/sql">www.pinnaclepublishing.com=/sql


"Eric Mamet" wrote in message news:11269d=cb.0309100520.6d6a4e1f@.posting.google.com...In the case of a heap, I understand the leaf-level contains some =sortof RowId which points directly to the right page in the table. Good,pretty quick.If the table contains a clustered index, I understand =this is what'sused as bookmark in all the non-clustered indexes of that table.Question: Couldn't this be extremely inefficient if =the clustered index is notunique?If I created a non unique =clustered index on a (small) column with lowselectivity, I might end up with =thousands of rows for each value inthe clustered index.Then a bookmark on =such a value does not seem too good, does it?I suppose there would be =several pages read before reaching the actualdata page...ThanksEric =

--=_NextPart_000_0037_01C3782F.68E72280--|||This question bothered me and it makes much more sense now.
Thanks very much to both of you.

Sunday, February 19, 2012

Generalisation schema Help(GIF Schema included)

I have a CASE(law firm type) table which uses a "case number" as a PK.
Each case can have zero-many "case plans" of three different types of
case plans, namely: (civil, tax, criminal).

I have created 3 different tables for each respective "case plan" type as their
details differ quite significantly. I have created an Audit table which holds
information about updates and inserts made to these tables.

My problem is referencing a specific "case plan" in the Audit table
as the CasePlanID in the Tax,Criminal,Civil tables could be the same, and
the only uniqueness to those tables is the table name (ie: Tax,Criminal,Civil).

I have thought of placing a "CONSTANT" column in each table with the type of the
Case plan as the value, but this introduces a lot of redundancy.
I have also thought of having an audit table for each respective case plan table
but this doesnt seem like good practice.

Perhaps a Generalisation class between Case and caseplans would help.

I also have to report across the case plans too.

Any help would be much appreciated ;-)What's interesting is that you have modelled them like they're inherited (as in OOP). What is the difference between the case plan types? Different columns? Are you likely to add another case plan type at some point in time?

Edit : forgive me i'm being a goon. You clearly state that their details differ quite significantly. What are those details? Can they be normalised?|||It would seem like your case plan type can be seperated into a seperate table as your case will ALWAYS have a plan type. So you can normalise those into another table:

CREATE TABLE CasePlanTypes(
PlanTypeID INT NOT NULL,
PlanName VARCHAR(255)
)|||There will only be 3 case plan types and no more.
The types differ quite a lot. there ARE about 10 columns of similarity between all 3, however, Civil and Criminal have about 20 additional columns and between civil and criminal, about 12 of those 20 are similar in data type, but semantically different. Example:
Tax: {Gross_amount_taxed_for_quarter, Net_Amount_taxed_for_Quarter}
Civil:{Gross_Amount_Reclaimed_for_quarter,Net_amou nt_reclaimed_for_quarter}