Monday, March 19, 2012
Generating an Index
sure it can be done.
I'll try to keep the information limited to only what's relevant to
what I have and what I am trying to achieve with this query.
I have a table that contains around 100,000 records. For the sake of
this discussion, assume just two columns:
ID Data
1 000000000000
2 010000000000
3 011111111111
4 101100011101
5 110100011011
6 111100000000
7 111100000111
8 111100010111
9 111100011011
10 111111111111
The Data column contains only 1's and 0's.
The Data column is a text column, not numeric.
The Data column is actually 255 chars long. (I limited it above to 12
for this example only)
Duplicates on the Data column are allowed and do exist.
With 100,000 records, you would note that in the above example record
10 would actually be record 100,000.
My aim is to somehow sort the data (by creating a third column) so that
the records are in order of "string distance" or similar. In other
words, so that similar strings are located next to (or as close as
possible to) each other.
For example, taking the data above:
The "distance" between record 8 and record 9 is 2 (ie Only two
positions in the Data are not the same)
The "distance" between record 3 and record 4 is 6
However, the "distance" between record 3 and record 10 is only 1,
but sorted in the normal fashion, there would be some 99,990 records
between them.
I was hoping that as I add records to this table I could calculate a
number or a code or something to create a third column that could be
indexed. Accessing records using this index would give me the records
in order of "String Distance" or similar.
I have looked up functions such as "Levenshtein - Edit String
Distance", which is fine when I have to strings to compare. I can't
see it helping in generating an index though.
I hope I have been clear in my explanation.
I would really appreciate any comments or discussion that could help me
achieve this.
Thanks for your time,
MJPlease post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, data types, etc. in
your schema are. Sample data is also a good idea, along with clear
specifications. It is very hard to debug code when you do not let us
see it.
Based on your narrative, did you mean to post something like this?
CREATE TABLE Foobar
(foo_id INTEGER NOT NULL PRIMARY KEY,
vector CHAR(255) NOT NULL);
INSERT INTO Foobar VALUES(1, '000000000000');
INSERT INTO Foobar VALUES(2, '010000000000')
INSERT INTO Foobar VALUES(3, '011111111111');
INSERT INTO Foobar VALUES(4, '101100011101');
INSERT INTO Foobar VALUES(5, '110100011011');
INSERT INTO Foobar VALUES(6, '111100000000');
INSERT INTO Foobar VALUES(7, '111100000111');
INSERT INTO Foobar VALUES(8, '111100010111');
INSERT INTO Foobar VALUES(9, '111100011011');
INSERT INTO Foobar VALUES(10, '111111111111');
Look up the difference in rows and records, and what a scalar value is.
These strings look like (uuggh!) binary vectors, not scalar values as
required by First Normal Form.
There is no way to use string distance as a sorting order because it is
a **binary** relationship.
Look up the use of the Sequence Auxiliary table.
CREATE PROCEDURE StringDistance (@.foo_1 INTEGER, @.foo_2 INTEGER)
AS
SELECT SUM (CASE
WHEN SUBSTRING (F1.vector, S.seq, 1)
<> SUBSTRING (F2.vector, S.seq, 1)
THEN 1 ELSE 0 END) AS distance
FROM Foobar AS F1, Foobar AS F2, Sequence AS S
WHERE F1.foo_id = @.foo_1
AND F2.foo_id = @.foo_2
AND S.seq <= 255 -- or whatever
GROUP BY F1.foo_id, F2.foo_id;|||Hi Mary, I created an example that may help you to solve the problem.
Instead of calculating the difference between two binary
representations, i used products table in Northwind database to gererate a
new computed column that represents the price difference between two
consecutives products. In your case, you can replace this basic function to
any that you wish. Here's the script:
USE Northwind
GO
-- Creates a new table just like products
SELECT *
INTO ProductsComputed
FROM Products
GO
-- Creates a function that calculates the difference between the current
record and the one just before (it selects all the products lower than the
actual key and pick -- the first one, in the inverse order).
-- ISNULL is needed due to productId 1
CREATE FUNCTION fnMoneyDifference (@.ProductId INT)
RETURNS MONEY
AS
BEGIN
DECLARE @.Value Money
SELECT @.Value = P.UnitPrice -
ISNULL ((SELECT TOP 1 P2.UnitPRice
FROM ProductsComputed AS P2
WHERE P2.ProductID < P.ProductID
ORDER BY ProductID DESC), 0)
FROM ProductsComputed AS P
WHERE P.ProductID = @.ProductID
RETURN @.Value
END
GO
-- Add a computed columns to the table, using the function and passing the
product key
ALTER TABLE ProductsComputed
ADD ComputedColumn AS dbo.fnMoneyDifference(ProductID)
GO
-- Verify the result
SELECT * FROM ProductsComputed
GO
Using computed column you guarantee that the difference will always be
updated, but be careful, this can consume many resources when a query is
issued against a big table. If the data almost doesn't change, it's better
to create a real column and populate the values instead of using a computed
column.
I hope it helps.
[]s
Luciano Caixeta Moreira
Brazil
"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:1126146910.541347.236630@.g14g2000cwa.googlegroups.com...
> Please post DDL, so that people do not have to guess what the keys,
> constraints, Declarative Referential Integrity, data types, etc. in
> your schema are. Sample data is also a good idea, along with clear
> specifications. It is very hard to debug code when you do not let us
> see it.
> Based on your narrative, did you mean to post something like this?
> CREATE TABLE Foobar
> (foo_id INTEGER NOT NULL PRIMARY KEY,
> vector CHAR(255) NOT NULL);
> INSERT INTO Foobar VALUES(1, '000000000000');
> INSERT INTO Foobar VALUES(2, '010000000000')
> INSERT INTO Foobar VALUES(3, '011111111111');
> INSERT INTO Foobar VALUES(4, '101100011101');
> INSERT INTO Foobar VALUES(5, '110100011011');
> INSERT INTO Foobar VALUES(6, '111100000000');
> INSERT INTO Foobar VALUES(7, '111100000111');
> INSERT INTO Foobar VALUES(8, '111100010111');
> INSERT INTO Foobar VALUES(9, '111100011011');
> INSERT INTO Foobar VALUES(10, '111111111111');
>
> Look up the difference in rows and records, and what a scalar value is.
> These strings look like (uuggh!) binary vectors, not scalar values as
> required by First Normal Form.
> There is no way to use string distance as a sorting order because it is
> a **binary** relationship.
> Look up the use of the Sequence Auxiliary table.
> CREATE PROCEDURE StringDistance (@.foo_1 INTEGER, @.foo_2 INTEGER)
> AS
> SELECT SUM (CASE
> WHEN SUBSTRING (F1.vector, S.seq, 1)
> <> SUBSTRING (F2.vector, S.seq, 1)
> THEN 1 ELSE 0 END) AS distance
> FROM Foobar AS F1, Foobar AS F2, Sequence AS S
> WHERE F1.foo_id = @.foo_1
> AND F2.foo_id = @.foo_2
> AND S.seq <= 255 -- or whatever
> GROUP BY F1.foo_id, F2.foo_id;
>|||Hi Luciano,
Thank you very much for your response. I really appreciate you helping
me out with my problem.
I'll see what I can do to convert your code to my needs.
Thanks again,
MJ|||let's say you ahve data like this
ID Data
1 000000000000
2 100000000000
3 010000000000
4 001000000000
5 000100000000
rows 2-5 are all only 1 digit different foprm row 1. Hiow would you
sort them?|||Ok Mary. If you need any further assistance, just send a new message.
Good luck.
Luciano
"Mary" <maryjones11289@.hotmail.com> wrote in message
news:1126209734.166547.191330@.z14g2000cwz.googlegroups.com...
> Hi Luciano,
> Thank you very much for your response. I really appreciate you helping
> me out with my problem.
> I'll see what I can do to convert your code to my needs.
> Thanks again,
> MJ
>|||That's exactly my question.
I'm not looking to sort on the "data" column...I was hoping to add a
third column which contains some sort of "weighting factor" or
something similar, so that when I sort the data on this column I end up
with records with the smallest "distance" between each other "close" to
each other.
You are correct with your sample data that all of them have a distance
of 1 from each other and I would expect that when sorted by a weight
factor they will reside "close" to each other, but consider:
ID Data
1 00000000000
.
.
.
55 011111111111
.
.
.
100,000 111111111111
Record 55 and record 100000 have a distance of "1", yet are over 90,000
records apart when sorted on the "data" column.
If I could weight this column and sort by that weight, I am hoping to
get record 55 "as close as possible" to record 100,000.
Even if I have to read 50-100 records to get it would be much better
than having to read over 90,000
MJ|||On 7 Sep 2005 18:55:45 -0700, Mary wrote:
>Hi, I need some assistance with a query. To be honest, I'm not even
>sure it can be done.
Hi Mary,
I'm pretty sure it can't be done. Any ordering must be defined on the
contents of the row, not on the difference between a row and the
"previous" or "next" one, as you can only determine the "previous" and
"next" after applying the ordering.
Simplifying your example even further, to length 2:
ID Data
1 00
2 01
3 10
4 11
Now, which row should be first? There is no answer - because for each
row, there are two rows with a difference in one position, and one row
with a difference in two positions.
An ordering that WOULD bne possible is an ordering on the number of
position changed from a chosen "first" value. Let's say we arbitrarily
choose "01" to be the first value in the sample above. That would make
row #2 the first (no difference). The second and third rows would be #1
and #4 (or #4 and #1 - that's arbitrary again), that both differ in one
position from "01", and row 3 ("10" - both position changed) goes last.
But that is probably not the ordering you want, because the second and
third row now differ in two positions, and the third anf fourth only on
one!
I hope this example has managed to cinvince you that the ordering you
require is indeed impossible - not as a result of a limitation in SQL
Server, but because the requirement itself is insuffucient to define
"the" correct order.
The next step, of course, is to find out why you (think that you) need
the data to be ordered in this way, and to consider alternate ways of
achieveing the same goal. In short: what is the actual business
requirement that you're trying to solve?
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)
Wednesday, March 7, 2012
Generate SQL for a table
Hi
I have two questions
i) is there any SP which will generates SQL Script for a table including all constarint(PK,FK), index etc. and the table data in the format of "INSERT INTO" . if NO, how can i create to do so, or any other SP which will help me( may be third party one)
ii)Where had "Scptxfr.exe" gone in SQL Server 2005 ?
Regards,
Thanks.
Gurpreet S. Gill
There is no SP. You will have to create one yourself or use SMO for example. I don't know the answer to the 2nd one. I am moving the thread to Tools forum to see if someone there might know.|||Gurpreet Singh Gill wrote:
Hi
I have two questions
i) is there any SP which will generates SQL Script for a table including all constarint(PK,FK), index etc. and the table data in the format of "INSERT INTO" . if NO, how can i create to do so, or any other SP which will help me( may be third party one)
ii)Where had "Scptxfr.exe" gone in SQL Server 2005 ?
Regards,
Thanks.
Gurpreet S. Gill
Hi,
if you are a .Net developer the insert scripts are already genrated for you by the wizard if you have pk in every tables. You might be interested in using it. even if you are not a VS developer
the answer to the second question is just a click away. here. check this out
look for this topic Where to Find SCPTXFR?
http://www.sqlservercentral.com/columnists/jreade/howtoscheduleasqlserverdatabasecreationscript.asp
thanks,
joey
|||joey
I dont want to create it by .NET code infact only by using the some SP, or inside SSMS only.
I read the artical but i dont find any thing about where is the SCPTXFR.EXE had gone, only one line say, "this is missing", but they are refering the Beta version of SQL Server.
Regards,
Thanks.
Gurpreet S. Gill
|||the title is misleading. sorry
anyway you can still use the v2000 version of scptxfr.exe
with 2005
|||
I know about scptxfr.exe, I want only one table Script(not all), with Data too.
|||
hi,
got this one from another thread
you can use the sql server wizard to generate scripts insert, update, delete query
Run sql server enterprise manager > tools > wizards > database > create stored procedures wizards.
choose the table and click on insert.
on the completing tab click on edit then click edit Sql
regards,
joey
|||
Sorry to say man, i want to create the SP which will Generate the SQL Script(including Constaints, indexes, checks..etc) for the table as well as the data of that table. The aim is to create the .sql file, which can be used for migration to other DB with the table(create script) with data.
|||I was struggling with the same case a few months back in SQL Server 2005 as the wizard does not include drop statements before creating the object. Unlike SQL Server 2000, the 2005 version uses the IF NOT EXISTS logic instead of the IF EXISTS logic. This is where chanced upon Scriptio, a tool created by SQLTeam.com. From the looks of your requirement, you definitely have to write a script (or an application) to generate an INSERT Statement if you want to migrate the data as well. Why don't you just generate a script to build the database and use SSIS to migrate the data afterwards?|||
Basically, the things are bit different for me, we had a setup at client end, which uses the SQL Server 2005 as a backend(before that the application was using the SQL Base by Centura/Gurpta Technologies).Client dont have SSMS(only Database Services), infact they are using the SQL Talk(again the tool from the same). they want the same funcrionalties as provided by SQL Base, this procedure, which i need is one of them
Could any body help me.
Regards,
Thanks.
Gurpreet S. Gill
generate sp using wizards - have problem
When I change some tables dbowner (design table > table and index properties > owner) I can’t generate sp using wizards (tools > wizards > database > create stored procedures wizards). What is the problem? (User dbo haven’t any problem)
Which application are you using for that ?HTH, Jens K. Suessmeyer.
http://www.sqlserver2005.de|||
Tharindu Dhaneenja wrote:
When I change some tables dbowner (design table > table and index properties > owner) I can’t generate sp using wizards (tools > wizards > database > create stored procedures wizards). What is the problem? (User dbo haven’t any problem)
does the user have the right to script? i mean permission wise?
|||
I am using sql server 2000. Manually I can write sp using my table
Eg : select EmpNo, EmpName from tharindu.emp (0 errors)
Sp name: tharindu.insertEmp_sp (after exec 0 errors (normal insert update sp))
My problem is I can’t create sp (insert record sp) using wizards.
|||
hi,
\which wizard are you referring to?
if it is the dot.net wizard, you need a primary key on the table
or you should be using the required data provider in the connection string
thanks
joey
|||Run sql server enterprise manager > tools > wizards > database > create stored procedures wizards.
|||could it be permission problem.
check if you have rights to create proc in the Db
|||
Yes I have rights to create proc in the Db, for example after login SQL query analyzer using my user (“Tharindu”) I can create, delete all sp and tables. But only problem is I can’t create sp using wizards.
This error prompt,
http://tharindu.50webs.com/err/error.htm
|||
Finally I see my error jpg doesn’t open.
Still can not find good solution I think sql server wizards support only user “dbo”
More information about this matter read this, Generate sp using wizards display flowing Error…(Error 21776)
generate sp using wizards - have problem
When I change some tables dbowner (design table > table and index properties > owner) I can’t generate sp using wizards (tools > wizards > database > create stored procedures wizards). What is the problem? (User dbo haven’t any problem)
Which application are you using for that ?HTH, Jens K. Suessmeyer.
http://www.sqlserver2005.de|||
Tharindu Dhaneenja wrote:
When I change some tables dbowner (design table > table and index properties > owner) I can’t generate sp using wizards (tools > wizards > database > create stored procedures wizards). What is the problem? (User dbo haven’t any problem)
does the user have the right to script? i mean permission wise?
|||
I am using sql server 2000. Manually I can write sp using my table
Eg : select EmpNo, EmpName from tharindu.emp (0 errors)
Sp name: tharindu.insertEmp_sp (after exec 0 errors (normal insert update sp))
My problem is I can’t create sp (insert record sp) using wizards.
|||
hi,
\which wizard are you referring to?
if it is the dot.net wizard, you need a primary key on the table
or you should be using the required data provider in the connection string
thanks
joey
|||Run sql server enterprise manager > tools > wizards > database > create stored procedures wizards.
|||could it be permission problem.
check if you have rights to create proc in the Db
|||
Yes I have rights to create proc in the Db, for example after login SQL query analyzer using my user (“Tharindu”) I can create, delete all sp and tables. But only problem is I can’t create sp using wizards.
This error prompt,
http://tharindu.50webs.com/err/error.htm
|||
Finally I see my error jpg doesn’t open.
Still can not find good solution I think sql server wizards support only user “dbo”
More information about this matter read this, Generate sp using wizards display flowing Error…(Error 21776)