SQL Questions and Answers

Published: June 2016

DATABASE:
A database is a collection of structured information, stored as computer files on disk. Databases are designed to maintain large amounts of information. They store data in an organized, structured manner, which makes it easy for users to manage and retrieve data when required.
DBMS:
A database management system (DBMS) is a software program that enables users to create and maintain databases. A DBMS also allows users to write queries against a database to perform actions such as retrieving, modifying and deleting data. A DBMS supports tables, which store data in rows and columns.
An RDBMS is a type of DBMS that stores information in the form of related tables.
What is RDBMS?

Relational Database Management Systems (RDBMS) are database management systems
that maintain data records and indices in tables. Relationships may be created and
maintained across and among the data and tables. In a relational database, relationships
between data items are expressed by means of tables. Interdependencies among these
tables are expressed by data values rather than by pointers. This allows a high degree of
data independence. An RDBMS can recombine data items from different files, providing
powerful tools for data usage.
What is the difference between DBMS and RDBMS?

A DBMS covers how data are stored in tables but does not relate tables to one another;
an RDBMS adds the relational model and the SQL syntax for relating tables and handling
the data stored in them. An RDBMS can support many concurrent users, which a simple
DBMS typically cannot. In a DBMS all tables are treated as independent entities, with no
relationships established among them. In an RDBMS the tables can be related, and the
user can establish various integrity constraints on them so that the data ultimately
presented to the user remains correct. In a DBMS there are entity sets in the form of
tables, but the relationships among them are not defined; in an RDBMS each entity is
well defined with a relationship set, making data retrieval fast and easy. A DBMS tends
to suit small applications, while an RDBMS is built for large amounts of data.
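The relational idea above can be sketched with a foreign key constraint. A minimal example using hypothetical Department and Employee tables (all names here are illustrative, not from any particular schema):

```sql
-- Hypothetical tables illustrating a relationship expressed through data values
CREATE TABLE Department (
    DeptID   INT PRIMARY KEY,
    DeptName VARCHAR(50) NOT NULL
);

CREATE TABLE Employee (
    EmpID   INT PRIMARY KEY,
    EmpName VARCHAR(50) NOT NULL,
    DeptID  INT NOT NULL,
    -- The RDBMS enforces the relationship: an Employee row cannot
    -- reference a DeptID that does not exist in Department
    CONSTRAINT FK_Employee_Department
        FOREIGN KEY (DeptID) REFERENCES Department (DeptID)
);
```

An insert into Employee with a DeptID not present in Department is rejected by the integrity constraint, which is exactly the behavior a plain DBMS without relationships cannot enforce.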
What is the ACID Property of a Transaction?

Basically, a transaction is a series of operations that either completes successfully or fails
as a group. ACID is a term coined by Theo Härder and Andreas Reuter in 1983, standing
for Atomicity, Consistency, Isolation and Durability; these are core DBMS concepts. A
group of SQL statements executed as a unit is called a transaction, or equivalently, a
transaction is a collection of actions that carries the ACID properties.

ACID stands for
A - Atomicity: all of the transaction's work takes effect, or none of it does
C - Consistency: the transaction moves the database from one valid state to another
I - Isolation: concurrent transactions do not see each other's intermediate states
D - Durability: once committed, the changes survive system failure
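The atomicity property can be sketched in T-SQL. A minimal example assuming a hypothetical Accounts table:

```sql
-- Hypothetical transfer between two account rows; either both updates
-- take effect (COMMIT) or neither does (ROLLBACK) -- atomicity.
BEGIN TRANSACTION;

UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 2;

IF @@ERROR <> 0
    ROLLBACK TRANSACTION;   -- undo all work done since BEGIN TRANSACTION
ELSE
    COMMIT TRANSACTION;     -- make the changes durable
```

If the server fails between the two updates, the uncommitted work is rolled back during recovery, so the database never shows money deducted from one account but not added to the other.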
What is Normalization?
Normalization is the process of efficiently organizing data in a database. There are two
goals of the normalization process: eliminating redundant data (for example, storing the
same data in more than one table) and ensuring data dependencies make sense (only
storing related data in a table). Both of these are worthy goals as they reduce the amount
of space a database consumes and ensure that data is logically stored.
The Normal Forms
The database community has developed a series of guidelines for ensuring that databases
are normalized. These are referred to as normal forms and are numbered from one (the
lowest form of normalization, referred to as first normal form or 1NF) through five (fifth
normal form or 5NF). In practical applications, you'll often see 1NF, 2NF, and 3NF along
with the occasional 4NF. Fifth normal form is very rarely seen and won't be discussed in
this article.
Before we begin our discussion of the normal forms, it's important to point out that they
are guidelines and guidelines only. Occasionally, it becomes necessary to stray from them
to meet practical business requirements. However, when variations take place, it's
extremely important to evaluate any possible ramifications they could have on your
system and account for possible inconsistencies. That said, let's explore the normal forms.
First Normal Form (1NF)
First normal form (1NF) sets the very basic rules for an organized database:
 Eliminate duplicative columns from the same table.
 Create separate tables for each group of related data and identify each row with a
unique column or set of columns (the primary key).
Second Normal Form (2NF)
Second normal form (2NF) further addresses the concept of removing duplicative data:
 Meet all the requirements of the first normal form.
 Remove subsets of data that apply to multiple rows of a table and place them in
separate tables.
 Create relationships between these new tables and their predecessors through the
use of foreign keys.
Third Normal Form (3NF)
Third normal form (3NF) goes one large step further:
 Meet all the requirements of the second normal form.
 Remove columns that are not dependent upon the primary key.

Fourth Normal Form (4NF)
Finally, fourth normal form (4NF) has one additional requirement:
 Meet all the requirements of the third normal form.
 A relation is in 4NF if it has no multi-valued dependencies.
Remember, these normalization guidelines are cumulative. For a database to be in 2NF, it
must first fulfill all the criteria of a 1NF database.
What is normalization?

Database normalization is a data design and organization process applied to data
structures based on rules that help build relational databases. In relational database
design, the process of organizing data to minimize redundancy. Normalization usually
involves dividing a database into two or more tables and defining relationships between
the tables. The objective is to isolate data so that additions, deletions, and modifications
of a field can be made in just one table and then propagated through the rest of the
database via the defined relationships.
Normalization is the process of designing a data model to efficiently store data in a
database. The end result is that redundant data is eliminated, and only data related to the
attribute is stored within the table. For example, let's say we store City, State and
ZipCode data for Customers in the same table as Other Customer data. With this
approach, we keep repeating the City, State and ZipCode data for all Customers in the
same area. Instead of storing the same data again and again, we could normalize the data
and create a related table called City. The "City" table could then store City, State and
ZipCode along with IDs that relate back to the Customer table, and we can eliminate
those three columns from the Customer table and add the new ID column. Normalization
rules have been broken down into several forms. People often refer to the third normal
form (3NF) when talking about database design. This is what most database designers try
to achieve: In the conceptual stages, data is segmented and normalized as much as
possible, but for practical purposes those segments are changed during the evolution of
the data model. Various normal forms may be introduced for different parts of the data
model to handle the unique situations you may face. Whether you have heard about
normalization or not, your database most likely follows some of the rules, unless all of
your data is stored in one giant table. We will take a look at the first three normal forms
and the rules for determining the different forms here.
Rules for First Normal Form (1NF)

Eliminate repeating groups. Suppose a table stores each computer together with all of its
software titles in a single Software column; that column holds a repeating group.

To follow the First Normal Form, we store one software title per record.

Rules for second Normal Form (2NF)

Eliminate redundant data, in addition to 1NF. If the table now repeats the full name of the
software on many rows, that name is redundant data.

To eliminate the redundant storage of data, we create two tables. The first table stores a
reference SoftwareID to a new table that holds a unique list of software titles.

Rules for Third Normal Form (3NF)

Eliminate columns not dependent on the key, in addition to 1NF and 2NF. If a single table
holds data about both the computer and the user, the user columns do not depend on the
computer key.

To eliminate columns not dependent on the key, we create separate tables. Now the data
stored in the computer table is only related to the computer, and the data stored in the
user table is only related to the user.
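The decomposition described above can be sketched as DDL. The table and column names below are hypothetical, chosen to match the computer/user/software example:

```sql
-- 1NF: one software title per row, no repeating groups
-- 2NF: software titles moved to their own table, referenced by ID
-- 3NF: computer data and user data split into separate tables
CREATE TABLE Software (
    SoftwareID INT PRIMARY KEY,
    Title      VARCHAR(100) NOT NULL
);

CREATE TABLE Computer (
    ComputerID INT PRIMARY KEY,
    SerialNo   VARCHAR(50) NOT NULL
);

CREATE TABLE [User] (
    UserID     INT PRIMARY KEY,
    UserName   VARCHAR(50) NOT NULL,
    ComputerID INT REFERENCES Computer (ComputerID)
);

-- Junction table relating computers to their installed software
CREATE TABLE ComputerSoftware (
    ComputerID INT REFERENCES Computer (ComputerID),
    SoftwareID INT REFERENCES Software (SoftwareID),
    PRIMARY KEY (ComputerID, SoftwareID)
);
```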

Advantages of normalization

1. Smaller database: By eliminating duplicate data, you will be able to reduce the
overall size of the database.
2. Better performance:
a. Narrow tables: Having more fine-tuned tables allows your tables to have
fewer columns and lets you fit more records per data page.
b. Fewer indexes per table mean faster maintenance tasks such as index
rebuilds.
c. Only join tables that you need.
Disadvantages of normalization

1. More tables to join: By spreading out your data into more tables, you increase
the need to join tables.
2. Tables contain codes instead of real data: Repeated data is stored as codes
rather than meaningful data. Therefore, there is always a need to go to the lookup
table for the value.
3. Data model is difficult to query against: The data model is optimized for
applications, not for ad hoc querying.
What are different normalization forms?

1NF: Eliminate Repeating Groups
Make a separate table for each set of related attributes, and give each table a primary key.
Each field contains at most one value from its attribute domain.
2NF: Eliminate Redundant Data
If an attribute depends on only part of a multi-valued key, remove it to a separate table.
3NF: Eliminate Columns Not Dependent On Key
If attributes do not contribute to a description of the key, remove them to a separate table.
All attributes must be directly dependent on the primary key.
BCNF: Boyce-Codd Normal Form
If there are non-trivial dependencies between candidate key attributes, separate them out
into distinct tables.
4NF: Isolate Independent Multiple Relationships
No table may contain two or more 1:n or n:m relationships that are not directly related.
5NF: Isolate Semantically Related Multiple Relationships
There may be practical constraints on information that justify separating logically related
many-to-many relationships.
ONF: Optimal Normal Form
A model limited to only simple (elemental) facts, as expressed in Object Role Model
notation.
DKNF: Domain-Key Normal Form
A model free from all modification anomalies.
Remember, these normalization guidelines are cumulative. For a database to be in 3NF, it
must first fulfill all the criteria of a 2NF and 1NF database.

What is Stored Procedure?
A stored procedure is a named group of SQL statements that have been previously
created and stored in the server database. Stored procedures accept input
parameters so that a single procedure can be used over the network by several
clients using different input data. And when the procedure is modified, all clients
automatically get the new version. Stored procedures reduce network traffic and
improve performance. Stored procedures can be used to help ensure the integrity of
the database.
e.g. sp_helpdb, sp_renamedb, sp_depends etc.
A stored procedure is a set of SQL commands that has been compiled and stored on
the database server.
Once the stored procedure has been "stored", client applications can execute the
stored procedure over and over again without sending it to the database server again
and without compiling it again.
Stored procedures improve performance by reducing network traffic and CPU load.

A stored procedure is a precompiled collection of Transact-SQL statements stored
under a name and processed as a unit that you can call from within another
Transact-SQL statement or from the client applications.
SQL Server ships with a number of stored procedures, which can be used for
managing the database and displaying information about databases and users. These
stored procedures are called system stored procedures. System stored procedure
names start with the prefix sp_ to distinguish them from user-created stored
procedures. The system stored procedures are stored in system databases such as
master and msdb. You can create your own stored procedures by
using the CREATE PROCEDURE statement. Stored procedures can have input and
output parameters and can issue an integer return code.
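As a sketch of input and OUTPUT parameters plus an integer return code, assuming the Students table used elsewhere in this document (the procedure name is hypothetical):

```sql
-- Hypothetical procedure showing an input parameter, an OUTPUT
-- parameter, and an integer return code
CREATE PROCEDURE GetStudentCountByGender
    @Gdr   VARCHAR(12),
    @Total INT OUTPUT
AS
BEGIN
    SELECT @Total = COUNT(*) FROM Students WHERE Gender = @Gdr;
    IF @Total = 0
        RETURN 1;   -- return code: no matching rows
    RETURN 0;       -- return code: success
END
GO

-- Calling the procedure and capturing both the OUTPUT value and the return code
DECLARE @Count INT, @Rc INT;
EXEC @Rc = GetStudentCountByGender @Gdr = 'Female', @Total = @Count OUTPUT;
```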
Using stored procedures has a number of advantages over giving users direct access
to the underlying data. These are:
 Performance reasons
 Security reasons
 Reliability reasons

Performance Reasons for Using the Stored Procedures
Using stored procedures benefits performance. Stored procedures run quickly
because they do not need to repeat the parsing, optimizing and compiling steps on
each execution; after the first execution, SQL Server has already parsed, optimized
and compiled the procedure. Since stored procedures run on the SQL Server, they
reduce the load on the client computer and benefit from the more powerful server
hardware. Using stored procedures instead of heavy-duty queries also reduces
network traffic, since the client sends the server only the stored procedure name
(perhaps with some parameters) instead of the full text of a large query.
Security Reasons for Using the Stored Procedures
Stored procedures can be used to enhance security and conceal underlying data
objects. For example, you can give the users permission to execute the stored
procedure to work with a restricted set of the columns and data, while not allowing
permissions to select or update the underlying data objects. By using stored
procedures, permission management can also be simplified. You can grant
EXECUTE permission on the stored procedure instead of granting permissions on the
underlying data objects.
Reliability Reasons for Using Stored Procedures
Stored procedures can be used to enhance the reliability of your application. For
example, if all clients use the same stored procedures to update the database, the
code base is smaller and easier to troubleshoot for any problems. In this case,
everyone is updating tables in the same order and there will be less risk of
deadlocks. Stored procedures can also be used to conceal changes in database
design. For example, if you denormalize your database design to provide faster query
performance, you can change only the stored procedure; applications that use the
results returned by it do not need to be rewritten.
Syntax
1) CREATE PROCEDURE ProcedureName
AS
Body of the Procedure
2) CREATE PROCEDURE ProcedureName
@ParameterName DataType
AS
Body of the Procedure
EX:
CREATE PROC GetListOfStudentsByGender
@Gdr VARCHAR(12)
AS
SELECT FirstName, LastName, DateOfBirth, HomePhone, Gender
FROM Students
WHERE Gender = @Gdr
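A sketch of how a client would call the procedure above ('Female' is just an example argument):

```sql
-- Executing the stored procedure with a parameter value
EXEC GetListOfStudentsByGender @Gdr = 'Female';
```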

What is Trigger?
A trigger is a SQL procedure that initiates an action when an event (INSERT, DELETE
or UPDATE) occurs. Triggers are stored in and managed by the DBMS. Triggers are
used to maintain the referential integrity of data by changing the data in a
systematic fashion. A trigger cannot be called or executed directly; the DBMS
automatically fires the trigger as a result of a data modification to the associated
table. Triggers can be viewed as similar to stored procedures in that both consist of
procedural logic that is stored at the database level. Stored procedures, however,
are not event-driven and are not attached to a specific table as triggers are. Stored
procedures are explicitly executed by invoking a CALL to the procedure, while
triggers are implicitly executed. In addition, triggers can also execute stored
procedures.
Nested Trigger: A trigger can also contain INSERT, UPDATE and DELETE logic within
itself, so when the trigger is fired because of data modification it can also cause
another data modification, thereby firing another trigger. A trigger that contains data
modification logic within itself is called a nested trigger.
A trigger is an object contained within a SQL Server database that is used to
execute a batch of SQL code whenever a specific event occurs. As the name
suggests, a trigger is "fired" whenever an INSERT, UPDATE, or DELETE SQL
command is executed against a specific table.
Triggers are associated with a single table, and are automatically executed internally
by SQL Server.
Syntax
CREATE TRIGGER trigger_name 
ON { table | view } 
[ WITH ENCRYPTION ] 

{ { FOR | AFTER | INSTEAD OF } { [ INSERT ] [ , ] [ UPDATE ] [ , ] 
[ DELETE ] } 
[ WITH APPEND ] 
[ NOT FOR REPLICATION ] 
AS 
[ { IF UPDATE ( column ) 
[ { AND | OR } UPDATE ( column ) ] 
[ ...n ] 
| IF ( COLUMNS_UPDATED ( ) { bitwise_operator } updated_bitmask ) 
{ comparison_operator } column_bitmask [ ...n ] 
} ] 
sql_statement [ ...n ] 

}
EX:
CREATE TRIGGER trig_addAuthor
ON authors
FOR INSERT
AS
-- Get the first and last name of the new author
DECLARE @newName VARCHAR(100)
SELECT @newName = (SELECT au_fName + ' ' + au_lName FROM Inserted)
-- Print the name of the new author
PRINT 'New author "' + @newName + '" added.'

The "Inserted" table is a virtual table which contains all of the fields and values from
the actual "INSERT" command that made SQL Server call the trigger in the first
place. Note that Inserted can contain multiple rows when a single INSERT statement
inserts more than one row; the single-variable assignment above assumes a one-row
insert.
UPDATE and DELETE triggers
Now that we understand how an "INSERT" trigger works, let's take a look at
"UPDATE" and "DELETE" triggers. Here's an "UPDATE" trigger:
CREATE TRIGGER trig_updateAuthor 
ON authors 
FOR UPDATE 
AS 
DECLARE @oldName VARCHAR(100) 
DECLARE @newName VARCHAR(100) 
IF NOT UPDATE(au_fName) AND NOT UPDATE(au_lName) 
BEGIN 
RETURN 
END 
SELECT @oldName = (SELECT au_fName + ' ' + au_lName FROM Deleted) 
SELECT @newName = (SELECT au_fName + ' ' + au_lName FROM Inserted) 
PRINT 'Name changed from "' + @oldName + '" to "' + @newName + '"'
"UPDATE" triggers have access to two virtual tables: Deleted (which contains all of
the fields and values for the records before they were updated), and Inserted (which
contains all of the fields and values for the records after they have been updated).
We could create a "DELETE" trigger on the "authors" table that would do this for us
automatically:
CREATE TRIGGER trig_delAuthor 
ON authors 
FOR DELETE 
AS 
DECLARE @isOnContract BIT 
SELECT @isOnContract = (SELECT contract FROM Deleted) 
IF(@isOnContract = 1) 
BEGIN 
PRINT 'Code to notify publisher goes here' 
END
AFTER Triggers
As the name specifies, AFTER triggers are executed after the action of the INSERT, UPDATE, or DELETE
statement is performed. This was the only option available in earlier versions of Microsoft SQL Server.
AFTER triggers can be specified on tables only. Here is a sample trigger creation statement on the
UserTable table.

Listing 1 (AFTER Trigger example)
------ Creating a DML trigger in T-SQL ------
CREATE TRIGGER tr_User_INSERT
ON UserTable
FOR INSERT
AS
PRINT GETDATE()
GO
INSERT UserTable (User_Name, Type) VALUES ('James', 'ADMIN')
------ Result ------
Apr 30 2007 7:04AM
INSTEAD OF Triggers
INSTEAD OF triggers are executed in place of the usual triggering action. INSTEAD OF triggers can also
be defined on views with one or more base tables, where they can extend the types of updates a view can
support.
------ Creating a DML trigger in T-SQL ------
CREATE TRIGGER tr_User_INSERT
ON UserTable
INSTEAD OF INSERT
AS
PRINT GETDATE()
GO
INSERT UserTable (User_Name, Type) VALUES ('James', 'ADMIN')
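An INSTEAD OF trigger is most useful on a view. A hedged sketch reusing the UserTable example; the view and trigger names below are hypothetical:

```sql
-- Hypothetical view over UserTable, with an INSTEAD OF trigger that
-- redirects inserts on the view into the base table
CREATE VIEW vw_AdminUsers AS
SELECT User_Name, Type FROM UserTable WHERE Type = 'ADMIN';
GO
CREATE TRIGGER tr_vw_AdminUsers_INSERT
ON vw_AdminUsers
INSTEAD OF INSERT
AS
    -- Perform the actual insert against the base table,
    -- forcing the Type column to 'ADMIN'
    INSERT UserTable (User_Name, Type)
    SELECT User_Name, 'ADMIN' FROM Inserted;
GO
-- The insert is intercepted; the row lands in UserTable with Type = 'ADMIN'
INSERT vw_AdminUsers (User_Name, Type) VALUES ('Mary', 'USER');
```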

What is View?
A view is a virtual table that consists of columns from one or more tables. It is a
query stored as an object, and it can be used like a normal table. Normally a view
does not store data permanently; when we create a view, only its definition schema
is stored as an object in the relevant database.
Indexed views:
The indexed (materialized) view is a feature introduced in SQL Server 2000. We have
seen that an ordinary view stores only its schema definition, and that it is executed,
loading data into a virtual table, at the time the view is used. An indexed view, by
contrast, persists the result set, and we can create indexes on it. It also allows us to
create INSTEAD OF triggers.
An indexed view must be created with the WITH SCHEMABINDING option.
Indexed views have some restrictions: for example, TOP, DISTINCT, UNION,
ORDER BY and certain aggregate functions cannot be used.
GROUP BY is allowed, but the COUNT function cannot be used; COUNT_BIG must be
used instead.
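A sketch of an indexed view under these rules, assuming the Northwind Order Details table; the view and index names below are hypothetical:

```sql
-- Hypothetical indexed view; SCHEMABINDING is required, base tables must
-- use two-part names, and COUNT_BIG(*) replaces COUNT(*) with GROUP BY
CREATE VIEW vw_OrderTotals
WITH SCHEMABINDING
AS
SELECT ProductID,
       COUNT_BIG(*)  AS OrderCount,
       SUM(Quantity) AS TotalQuantity
FROM dbo.[Order Details]
GROUP BY ProductID;
GO
-- The first index on an indexed view must be a unique clustered index;
-- creating it is what materializes the view's result set
CREATE UNIQUE CLUSTERED INDEX IX_vw_OrderTotals
    ON vw_OrderTotals (ProductID);
```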

A simple view can be thought of as a subset of a table. It can be used for retrieving
data, as well as updating or deleting rows. Rows updated or deleted in the view are
updated or deleted in the table the view was created with. It should also be noted
that as data in the original table changes, so does data in the view, as views are the
way to look at part of the original table. The results of using a view are not
permanently stored in the database. The data accessed through a view is actually
constructed using standard T-SQL select command and can come from one to many
different base tables or even other views.
In SQL Server a view represents a virtual table. Just like a real table, a view consists
of rows with columns, and you can retrieve data from a view (sometimes even
update data in a view). The fields in the view’s virtual table are the fields of one or
more real tables in the database. You can use views to join two tables in your
database and present the underlying data as if the data were coming from a single
table, thus simplifying the schema of your database for users performing ad-hoc
reporting. You can also use views as a security mechanism to restrict the data
available to end users. Views can also aggregate data (particularly useful if you can
take advantage of indexed views), and help partition data.
Syntax
CREATE VIEW view_name
[(column_name[,column_name]….)]
[WITH ENCRYPTION]
AS select_statement [WITH CHECK OPTION]
EX:
CREATE VIEW "Order Details Extended" AS
SELECT
"Order Details".OrderID,
"Order Details".ProductID,
Products.ProductName,
"Order Details".UnitPrice,
"Order Details".Quantity,
"Order Details".Discount,
(CONVERT(money, ("Order Details".UnitPrice * Quantity * (1 - Discount) / 100)) * 100) AS ExtendedPrice
FROM Products
INNER JOIN "Order Details"
ON Products.ProductID = "Order Details".ProductID
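Once created, the view can be queried exactly like a table, for example:

```sql
-- Filtering and projecting through the view as if it were a base table
SELECT OrderID, ProductName, ExtendedPrice
FROM "Order Details Extended"
WHERE Discount > 0;
```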

What is Index?
An index, like the index of a book, enables the database to retrieve and present data
to the end user with ease. An index can be defined as a mechanism for providing fast
access to table rows and for enforcing constraints.

Indexes can be clustered or non-clustered. A clustered index stores the data rows in
the table sorted by their key values. Each table can have only one clustered index,
because the data rows themselves can be stored in only one order. When a table has
a clustered index, it is known as a clustered table. Non-clustered indexes have a
structure separate from the data rows. A non-clustered index entry contains the key
value together with a pointer to the data row that contains that key value; this
pointer is known as the row locator. The structure of the row locator depends on how
the data pages are stored. If the data pages are stored as a heap, the row locator is
a pointer to the row. If the data pages are stored in a clustered table, the row locator
is the clustered index key.
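The two index types are created with similar syntax. A sketch against the Products table, assuming it does not already have a clustered index (in a real Northwind database the primary key would already occupy the clustered slot):

```sql
-- A clustered index determines the physical order of the rows;
-- only one per table is possible
CREATE CLUSTERED INDEX IX_Products_ProductID
    ON Products (ProductID);

-- A non-clustered index is a separate structure whose entries
-- point back to the rows via the row locator
CREATE NONCLUSTERED INDEX IX_Products_ProductName
    ON Products (ProductName);
```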

An index is a physical structure containing pointers to the data. Indices are created
in an existing table to locate rows more quickly and efficiently. It is possible to create
an index on one or more columns of a table, and each index is given a name. The
users cannot see the indexes; they are just used to speed up queries. Effective
indexes are one of the best ways to improve performance in a database application.
A table scan happens when there is no index available to help a query. In a table
scan SQL Server examines every row in the table to satisfy the query results. Table
scans are sometimes unavoidable, but on large tables, scans have a severe impact
on performance.
Clustered indexes define the physical sorting of a database table’s rows in the
storage media. For this reason, each database table may have only one clustered
index.
Non-clustered indexes are created outside of the database table and contain a sorted
list of references to the table itself.
Relational databases like SQL Server use indexes to find data quickly when a query is
processed. Creating and removing indexes from a database schema will rarely result
in changes to an application's code; indexes operate 'behind the scenes' in support of
the database engine. However, creating the proper index can drastically increase the
performance of an application.
The SQL Server engine uses an index in much the same way a reader uses a book
index. For example, one way to find all references to INSERT statements in a SQL
book would be to begin on page one and scan each page of the book. We could mark
each time we find the word INSERT until we reach the end of the book. This
approach is pretty time consuming and laborious. Alternately, we can also use the
index in the back of the book to find a page number for each occurrence of the
INSERT statements. This approach produces the same results as above, but with
tremendous savings in time.
When a SQL Server has no index to use for searching, the result is similar to the
reader who looks at every page in a book to find a word: the SQL engine needs to
visit every row in a table. In database terminology we call this behavior a table scan,
or just scan.
A table scan is not always a problem, and is sometimes unavoidable. However, as a
table grows to thousands of rows and then millions of rows and beyond, scans
become correspondingly slower and more expensive.

Consider the following query on the Products table of the Northwind database. This
query retrieves products in a specific price range.
SELECT ProductID, ProductName, UnitPrice
FROM Products WHERE (UnitPrice > 12.5) AND (UnitPrice < 14)
There is currently no index on the Products table to help this query, so the database
engine performs a scan and examines each record to see if UnitPrice falls between
12.5 and 14. In this example, the database search touches a total of 77 records to
find just three matches.

Now imagine if we created an index, just like a book index, on the data in the UnitPrice column. Each
index entry would contain a copy of the UnitPrice value for a row, and a reference (just like a page number)
to the row where the value originated. SQL will sort these index entries into ascending order. The index
will allow the database to quickly narrow in on the three rows to satisfy the query, and avoid scanning
every row in the table.
Create An Index
The command specifies the name of the index (IDX_UnitPrice), the table name (Products), and the column
to index (UnitPrice).
CREATE INDEX [IDX_UnitPrice] ON Products (UnitPrice)
To verify that the index is created, use the following stored procedure to see a list of all indexes on the
Products table:
EXEC sp_helpindex Products
How It Works
The database takes the columns specified in a CREATE INDEX command and sorts the values into a
special data structure known as a B-tree. A B-tree structure supports fast searches with a minimum amount
of disk reads, allowing the database engine to quickly find the starting and stopping points for the query we
are using.

Conceptually, we may think of an index as a sorted list of entries. Each index entry contains the index key
(UnitPrice). Each entry also includes a reference (which points) to the table rows which share that particular
value and from which we can retrieve the required information.

Much like the index in the back of a book helps us find keywords quickly, the database is able to
quickly narrow the number of records it must examine to a minimum by using the sorted list of UnitPrice
values stored in the index. We have avoided a table scan to fetch the query results. Given this sketch of how
indexes work, let's examine some of the scenarios where indexes offer a benefit.
Taking Advantage of Indexes
The database engine can use indexes to boost performance in a number of different queries. Sometimes
these performance improvements are dramatic. An important feature of SQL Server 2000 is a component
known as the query optimizer. The query optimizer's job is to find the fastest and least resource intensive
means of executing incoming queries. An important part of this job is selecting the best index or indexes to
perform the task. In the following sections we will examine the types of queries with the best chance of
benefiting from an index.
Searching For Records
The most obvious use for an index is in finding a record or set of records matching a WHERE clause.
Indexes can aid queries looking for values inside of a range (as we demonstrated earlier), as well as queries
looking for a specific value. By way of example, the following queries can all benefit from an index on
UnitPrice:
DELETE FROM Products WHERE UnitPrice = 1
UPDATE Products SET Discontinued = 1 WHERE UnitPrice > 15
SELECT * FROM PRODUCTS WHERE UnitPrice BETWEEN 14 AND 16
Indexes work just as well when searching for a record in DELETE and UPDATE commands as they do for
SELECT statements.

Sorting Records
When we ask for a sorted dataset, the database will try to find an index and avoid sorting the results during
execution of the query. We control sorting of a dataset by specifying a field, or fields, in an ORDER BY
clause, with the sort order as ASC (ascending) or DESC (descending). For example, the following query
returns all products sorted by price:
SELECT * FROM Products ORDER BY UnitPrice ASC
With no index, the database will scan the Products table and sort the rows to process the query. However,
the index we created on UnitPrice (IDX_UnitPrice) earlier provides the database with a presorted list of
prices. The database can simply scan the index from the first entry to the last entry and retrieve the rows in
sorted order.
The same index works equally well with the following query, simply by scanning the index in reverse.
SELECT * FROM Products ORDER BY UnitPrice DESC
Grouping Records
We can use a GROUP BY clause to group records and aggregate values, for example, counting the number
of orders placed by a customer. To process a query with a GROUP BY clause, the database will often sort
the results on the columns included in the GROUP BY. The following query counts the number of products
at each price by grouping together records with the same UnitPrice value.
SELECT Count (*), UnitPrice FROM Products GROUP BY UnitPrice
The database can use the IDX_UnitPrice index to retrieve the prices in order. Since matching prices appear
in consecutive index entries, the database is able to count the number of products at each price quickly.
Indexing a field used in a GROUP BY clause can often speed up a query.
Maintaining a Unique Column
Columns requiring unique values (such as primary key columns) must have a unique index applied. There
are several methods available to create a unique index. Marking a column as a primary key will
automatically create a unique index on the column. We can also create a unique index by checking the
Create UNIQUE checkbox in the dialog shown earlier. The screen shot of the dialog displayed the index
used to enforce the primary key of the Products table. In this case, the Create UNIQUE checkbox is
disabled, since an index to enforce a primary key must be a unique index. However, creating new indexes
not used to enforce primary keys will allow us to select the Create UNIQUE checkbox. We can also create
a unique index using SQL with the following command:
CREATE UNIQUE INDEX IDX_ProductName On Products (ProductName)
The above SQL command will not allow any duplicate values in the ProductName column, and an index is
the best tool for the database to use to enforce this rule. Each time an application adds or modifies a row in
the table, the database needs to search all existing records to ensure none of the values in the new data
duplicate existing values. Indexes, as we should know by now, will improve this search time.
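As a quick sketch of how the unique index enforces the rule, consider two identical inserts; the product name below is made up for illustration:

```sql
-- First insert succeeds: the name is new to the ProductName column
INSERT INTO Products (ProductName) VALUES ('Mango Chutney')

-- Second insert fails: the unique index IDX_ProductName rejects the
-- duplicate with an error along the lines of
-- "Cannot insert duplicate key row ... with unique index 'IDX_ProductName'"
INSERT INTO Products (ProductName) VALUES ('Mango Chutney')
```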
Index Drawbacks
There are tradeoffs to almost any feature in computer programming, and indexes are no exception. While
indexes provide a substantial performance benefit to searches, there is also a downside to indexing. Let's
talk about some of those drawbacks now.

Indexes and Disk Space
Indexes are stored on the disk, and the amount of space required will depend on the size of the table, and
the number and types of columns used in the index. Disk space is generally cheap enough to trade for
application performance, particularly when a database serves a large number of users. To see the space
required for a table, use the sp_spaceused system stored procedure in a query window.
EXEC sp_spaceused Orders
Given a table name (Orders), the procedure will return the amount of space used by the data and all indexes
associated with the table, like so:
Name     rows   reserved   data     index_size   unused
-------  -----  ---------  -------  -----------  -------
Orders   830    504 KB     160 KB   320 KB       24 KB
According to the output above, the table data uses 160 kilobytes, while the table indexes use twice as much,
or 320 kilobytes. The ratio of index size to table size can vary greatly, depending on the columns, data
types, and number of indexes on a table.
Indexes and Data Modification
Another downside to using an index is the performance implication on data modification statements. Any
time a query modifies the data in a table (INSERT, UPDATE, or DELETE), the database needs to update all
of the indexes where data has changed. As we discussed earlier, indexing can help the database during data
modification statements by allowing the database to quickly locate the records to modify, however, we now
caveat the discussion with the understanding that providing too many indexes to update can actually hurt
the performance of data modifications. This leads to a delicate balancing act when tuning the database for
performance.
In decision support systems and data warehouses, where information is stored for reporting purposes, data
remains relatively static and report generating queries outnumber data modification queries. In these types
of environments, heavy indexing is commonplace in order to optimize the reports generated. In contrast, a
database used for transaction processing will see many records added and updated. These types of
databases will use fewer indexes to allow for higher throughput on inserts and updates.
Every application is unique, and finding the best indexes to use for a specific application usually requires
some help from the optimization tools offered by many database vendors. SQL Server 2000 and Access
include the Profiler and Index Tuning Wizard tools to help tweak performance.
Now we have enough information to understand why indexes are useful and where indexes are best
applied. It is time now to look at the different options available when creating an index and then address
some common rules of thumb to use when planning the indexes for your database.
Clustered Indexes
Earlier in the article we made an analogy between a database index and the index of a book. A book index
stores words in order with a reference to the page numbers where the word is located. This type of index for
a database is a nonclustered index; only the index key and a reference are stored. In contrast, a common
analogy for a clustered index is a phone book. A phone book still sorts entries into alphabetical order. The
difference is, once we find a name in a phone book, we have immediate access to the rest of the data for the
name, such as the phone number and address.
For a clustered index, the database will sort the table's records according to the column (or columns)
specified by the index. A clustered index contains all of the data for a table in the index, sorted by the index

key, just like a phone book is sorted by name and contains all of the information for the person inline. The
nonclustered indexes created earlier in the chapter contain only the index key and a reference to find the
data, which is more like a book index. You can only create one clustered index on each table.
In the diagram below we have a search using a clustered index on the UnitPrice column of the Products
table. Compare this diagram to the previous diagram with a regular index on UnitPrice. Although we are
only showing three columns from the Products table, all of the columns are present. Notice the rows are
sorted into the order of the index, and there is no reference to follow from the index back to the data.

A clustered index is the most important index you can apply to a table. If the database engine can use a
clustered index during a query, the database does not need to follow references back to the rest of the data,
as happens with a nonclustered index. The result is less work for the database, and consequently, better
performance for a query using a clustered index.
To create a clustered index, simply select the Create As CLUSTERED checkbox in the dialog box we used
at the beginning of the chapter. The SQL syntax for a clustered index simply adds a new keyword to the
CREATE INDEX command, as shown below:
CREATE CLUSTERED INDEX IDX_SupplierID ON Products(SupplierID)
Most of the tables in the Northwind database already have a clustered index defined on a table. Since we
can only have one clustered index per table, and the Products table already has a clustered index
(PK_Products) on the primary key (ProductId), the above command should generate the following error:
Cannot create more than one clustered index on table 'Products'.
Drop the existing clustered index 'PK_Products' before creating another.
As a general rule of thumb, every table should have a clustered index. If you create only one index for a
table, use a clustered index. Not only is a clustered index more efficient than other indexes for retrieval
operations, a clustered index also helps the database efficiently manage the space required to store the table.
In SQL Server, creating a primary key constraint will automatically create a clustered index (if none exists)
using the primary key column as the index key.
Sometimes it is better to use a unique nonclustered index on the primary key column, and place the
clustered index on a column used by more queries. For example, if the majority of searches are for the price
of a product instead of the primary key of a product, the clustered index could be more effective if used on
the price field. A clustered index can also be a UNIQUE index.

A Disadvantage to Clustered Indexes
If we update a record and change the value of an indexed column in a clustered index, the database might
need to move the entire row into a new position to keep the rows in sorted order. This behavior essentially
turns an update query into a DELETE followed by an INSERT, with an obvious decrease in performance. A
table's clustered index can often be found on the primary key or a foreign key column, because key values
generally do not change once a record is inserted into the database.
Composite Indexes
A composite index is an index on two or more columns. Both clustered and nonclustered indexes can be
composite indexes. Composite indexes are especially useful in two different circumstances. First, you can
use a composite index to cover a query. Secondly, you can use a composite index to help match the search
criteria of specific queries. We will go into more detail and give examples of these two areas in the
following sections.
Covering Queries with an Index
Earlier in the article we discussed how an index, specifically a nonclustered index, contains only the key
values and a reference to find the associated row of data. However, if the key value contains all of the
information needed to process a query, the database never has to follow the reference and find the row; it
can simply retrieve the information from the index and save processing time. This is always a benefit for
clustered indexes.
As an example, consider the index we created on the Products table for UnitPrice. The database copied the
values from the UnitPrice column and sorted them into an index. If we execute the following query, the
database can retrieve all of the information for the query from the index itself.
SELECT UnitPrice FROM Products ORDER BY UnitPrice
We call these types of queries covered queries, because all of the columns requested in the output are
contained in the index itself. A clustered index, if selected for use by the query optimizer, always covers a
query, since it contains all of the data in a table.
For the following query, there are no covering indexes on the Products table.
SELECT ProductName, UnitPrice FROM Products ORDER BY UnitPrice
This is because although the database will use the index on UnitPrice to avoid sorting records, it will need
to follow the reference in each index entry to find the associated row and retrieve the product name. By
creating a composite index on two columns (ProductName and UnitPrice), we can cover this query with the
new index.
Matching Complex Search Criteria
For another way to use composite indexes, let's take a look at the OrderDetails table of Northwind. There
are two key values in the table (OrderID and ProductID); these are foreign keys, referencing the Orders and
Products tables respectively. There is no column dedicated for use as a primary key; instead, the primary
key is the combination of the columns OrderID and ProductID.
The primary key constraint on these columns will generate a composite index, which is unique of course.
The command the database would use to create the index looks something like the following:

CREATE UNIQUE CLUSTERED INDEX PK_Order_Details
ON [Order Details] (OrderID, ProductID)
The order in which columns appear in a CREATE INDEX statement is significant. The primary sort order
for this index is OrderID. When the OrderID is the same for two or more records, the database will sort this
subset of records on ProductID.
The order of columns determines how useful the index is for a query. Consider the phone book sorted by
last name then first name. The phone book makes it easy to find all of the listings with a last name of
Smith, or all of the listings with a last name of Jones and a first name of Lisa, but it is difficult to find all
listings with a first name of Gary without scanning the book page by page.
Likewise, the composite index on Order Details is useful in the following two queries:
SELECT * FROM [Order Details] WHERE OrderID = 11077
SELECT * FROM [Order Details] WHERE OrderID = 11077 AND ProductID = 13
However, the following query cannot take advantage of the index we created since ProductID is the second
part of the index key, just like the first name field in a phone book.
SELECT * FROM [Order Details] WHERE ProductID = 13
In this case, however, ProductID is part of the primary key, and Northwind also defines a separate index on
the ProductID column for the database to use for this query.
Suppose the following query is the most popular query executed by our application, and we decided we
needed to tune the database to support it.
SELECT ProductName, UnitPrice FROM Products ORDER BY UnitPrice
We could create the following index to cover the query. Notice we have specified two columns for the
index: UnitPrice and ProductName (making the index a composite index):
CREATE INDEX IX_UnitPrice_ProductName ON Products (UnitPrice, ProductName)
While covered queries can provide a performance benefit, remember there is a price to pay for each index
we add to a table, and we can also never cover every query in a non-trivial application.

What is the difference between clustered and a non-clustered index?

A clustered index is created implicitly when the table has a primary key column. Since a table
can have only one primary key, there can be only one clustered index.
A nonclustered index must be created explicitly. A table can have up to 249 nonclustered
indexes in SQL Server 2005 and 999 in SQL Server 2008.

Compared with a nonclustered index, a clustered index retrieves results very quickly,
because the data is stored on the same node as the index key.
In a nonclustered index, the data resides on separate pages, and the index node holds
only a reference to the page containing the data.
A clustered index is a special type of index that reorders the way records in the table
are physically stored. Therefore table can have only one clustered index. The leaf
nodes of a clustered index contain the data pages.
A nonclustered index is a special type of index in which the logical order of the index
does not match the physical stored order of the rows on disk. The leaf node of a
nonclustered index does not consist of the data pages. Instead, the leaf nodes
contain index rows.
What are the different index configurations a table can have?
A table can have one of the following index configurations:
No indexes
A clustered index
A clustered index and many nonclustered indexes
A nonclustered index
Many nonclustered indexes

What is a cursor?
Cursors are the SLOWEST way to access data inside SQL Server. They should only be
used when you truly need to access one row at a time.

Cursor is a database object used by applications to manipulate data in a set on a
row-by-row basis, instead of the typical SQL commands that operate on all the rows
in the set at one time.
In order to work with a cursor we need to perform some steps in the following order:
Declare cursor
Open cursor
Fetch row from the cursor
Process fetched row
Close cursor
Deallocate cursor
Syntax:
DECLARE cursor_name [INSENSITIVE] [SCROLL] CURSOR
FOR select_statement
[FOR {READ ONLY | UPDATE [OF column_name [,...n]]}]

where

cursor_name - The name of the server side cursor, must contain from 1 to 128 characters.
INSENSITIVE - Specifies that cursor will use a temporary copy of the data instead of base tables.
This cursor does not allow modifications and modifications made to base tables are not reflected
in the data returned by fetches made to this cursor.
SCROLL - Specifies that cursor can fetch data in all directions, not only sequentially until the end
of the result set. If this argument is not specified, FETCH NEXT is the only fetch option
supported.
select_statement - The standard select statement, cannot contain COMPUTE, COMPUTE BY,
FOR BROWSE, and INTO keywords.
READ ONLY - Specifies that cursor cannot be updated.
UPDATE [OF column_name [,...n]] - Specifies that all cursor's columns can be updated (if OF
column_name [,...n] is not specified), or only the columns listed in the OF column_name [,...n] list
allow modifications.
EX:
DECLARE @AuthorID char(11)
DECLARE c1 CURSOR READ_ONLY
FOR
SELECT au_id
FROM authors
OPEN c1
FETCH NEXT FROM c1
INTO @AuthorID
WHILE @@FETCH_STATUS = 0
BEGIN
PRINT @AuthorID
FETCH NEXT FROM c1
INTO @AuthorID
END
CLOSE c1
DEALLOCATE c1

What is the use of DBCC commands?
DBCC stands for database consistency checker. We use these commands to check the consistency of the
databases, i.e., maintenance, validation task and status checks.
E.g. DBCC CHECKDB - Ensures that tables in the db and the indexes are correctly linked.
DBCC CHECKALLOC - To check that all pages in a db are correctly allocated.
DBCC CHECKFILEGROUP - Checks all tables file group for any damage.

What is a Linked Server?
Linked Servers is a concept in SQL Server by which we can add other SQL Server to a Group and query
both the SQL Server dbs using T-SQL Statements. With a linked server, you can create very clean, easy to
follow, SQL statements that allow remote data to be retrieved, joined and combined with local data.
The stored procedures sp_addlinkedserver and sp_addlinkedsrvlogin are used to add a new linked server.
Think of a Linked Server as an alias on your local SQL server that points to an
external data source. This external data source can be Access, Oracle, Excel or
almost any other data system that can be accessed by OLE DB or ODBC--including other
MS SQL servers. An MS SQL linked server is similar to the MS Access feature of
creating a "Link Table."
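A minimal sketch of registering a linked server with the procedures mentioned above; the server name, data source, and login details are placeholders:

```sql
-- Register another SQL Server instance as a linked server
EXEC sp_addlinkedserver
    @server = 'REMOTESRV',              -- placeholder alias
    @srvproduct = '',
    @provider = 'SQLOLEDB',
    @datasrc = 'remotehost\SQLEXPRESS'  -- placeholder instance

-- Map local logins to a remote login (placeholder credentials)
EXEC sp_addlinkedsrvlogin
    @rmtsrvname = 'REMOTESRV',
    @useself = 'false',
    @locallogin = NULL,
    @rmtuser = 'remoteUser',
    @rmtpassword = 'remotePassword'

-- Query the remote data with four-part naming
SELECT * FROM REMOTESRV.Northwind.dbo.Orders
```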
Why use a Linked Server?
With a linked server, you can create very clean, easy to follow, SQL statements that
allow remote data to be retrieved, joined and combined with local data.
While it would be convenient to have all of our business data in one place, there are
too many obstacles such as Vendor applications built for a specific data store, data
sets too large for one server, legacy flat file applications that are cost prohibitive to
recreate and changing business standards, preventing this from happening.
"Replication Manager" has made moving data from one SQL Server to another on a
regular basis relatively easy. However, duplicating data to an application server is not
always the best solution. If your source is large, and you cannot predict what subset
of data you will need, then a linked server may be a better solution.
If you have a very large data set, there may be performance benefits to splitting
your data into pieces, and moving those pieces onto different servers. Then using
distributed partitioned views to present the data as one source. If so, linked servers
are the technology that makes it possible.
Why not use a Linked Server?
If the remote data is not yours, and the owning department will not allow you
remote access, then a linked server is out. You will have to rely on some type of
scheduled pickup and exchange.
When absolute, best possible performance is required, local data will outperform a
linked server.
If the physical link between your SQL Server and the remote data is slow, or not
reliable, then a linked server is not a good solution.

What is Collation?
Collation refers to a set of rules that determine how data is sorted and compared. Character data is sorted

using rules that define the correct character sequence, with options for specifying case-sensitivity, accent
marks, kana character types and character width.
Case sensitivity
If A and a, B and b, etc. are treated in the same way then it is case-insensitive. A
computer otherwise treats A and a differently because their character codes differ:
the code for A is 65, while a is 97; B is 66 and b is 98.
Accent sensitivity
If a and á, o and ó are treated in the same way, then it is accent-insensitive. A
computer otherwise treats a and á differently because their character codes differ:
in the Latin-1 character set, a is 97 and á is 225; o is 111 and ó is 243.
Kana Sensitivity
When Japanese kana characters Hiragana and Katakana are treated differently, it is
called Kana sensitive.
Width sensitivity
When a single-byte character (half-width) and the same character when represented
as a double-byte character (full-width) are treated differently then it is width
sensitive.
What are different type of Collation Sensitivity?
Case sensitivity
A and a, B and b, etc.
Accent sensitivity
a and á, o and ó, etc.
Kana Sensitivity
When Japanese kana characters Hiragana and Katakana are treated differently, it is called Kana sensitive.
Width sensitivity
When a single-byte character (half-width) and the same character when represented as a double-byte
character (full-width) are treated differently then it is width sensitive.
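As a sketch, SQL Server lets you override sensitivity for a single comparison with a COLLATE clause; the collation names below are standard SQL Server collations (CI = case-insensitive, AI = accent-insensitive):

```sql
-- Case-insensitive comparison: matches 'Chai', 'CHAI', 'chai', etc.
SELECT * FROM Products
WHERE ProductName = 'chai' COLLATE Latin1_General_CI_AS

-- Case- and accent-insensitive: 'a' and 'á' compare as equal
SELECT * FROM Products
WHERE ProductName = 'Cafe' COLLATE Latin1_General_CI_AI
```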

What’s the difference between a primary key and a unique key?
Both primary key and unique constraints enforce uniqueness of the column on which they are defined. But by default
a primary key creates a clustered index on the column, whereas unique creates a nonclustered index. Another
major difference is that a primary key doesn't allow NULLs, while a unique key (in SQL Server) allows exactly one
NULL.
1) By default Primary Key will generate Clustered Index
whereas Unique Key will Generate Non-Clustered Index.

2) A Primary Key is a combination of UNIQUE and NOT NULL constraints, so it can't
have duplicate values or any NULL. In Oracle, a UNIQUE key can have any number of
NULLs, whereas in SQL Server it can have only one NULL.
3) A table can have only one PK, but it can have any number of UNIQUE keys.
A UNIQUE constraint is similar to PRIMARY key, but you can have more than one UNIQUE constraint
per table.
When you declare a UNIQUE constraint, SQL Server creates a UNIQUE index to speed up the process of
searching for duplicates. In this case the index defaults to NONCLUSTERED index, because you can have
only one CLUSTERED index per table.
* The number of UNIQUE constraints per table is limited by the number of indexes on the table, i.e., 249
nonclustered indexes and one possible clustered index.
Contrary to a PRIMARY key, a UNIQUE constraint can accept NULL, but just once. If the constraint is
defined on a combination of fields, then every field can accept NULL individually, as
long as the combination of values is unique.
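A small sketch of the SQL Server behavior described above; the table and column names are made up for illustration:

```sql
CREATE TABLE Employees (
    EmployeeID INT PRIMARY KEY,   -- no NULLs, no duplicates
    SSN CHAR(11) UNIQUE           -- no duplicates, but one NULL allowed
)

INSERT INTO Employees VALUES (1, NULL)   -- succeeds: first NULL
INSERT INTO Employees VALUES (2, NULL)   -- fails in SQL Server: a second NULL
                                         -- violates the UNIQUE constraint
```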

How to implement one-to-one, one-to-many and many-to-many relationships while designing tables?
One-to-One relationship can be implemented as a single table and rarely as two tables with primary and
foreign key relationships.
One-to-Many relationships are implemented by splitting the data into two tables with primary key and
foreign key relationships.
Many-to-Many relationships are implemented using a junction table with the keys from both the tables
forming the composite primary key of the junction table.
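The junction-table pattern can be sketched as follows; the student/course tables are hypothetical:

```sql
CREATE TABLE Students (StudentID INT PRIMARY KEY, Name NVARCHAR(50))
CREATE TABLE Courses  (CourseID  INT PRIMARY KEY, Title NVARCHAR(50))

-- Junction table: the composite primary key is built from both foreign keys,
-- so each student/course pairing can appear only once
CREATE TABLE StudentCourses (
    StudentID INT REFERENCES Students (StudentID),
    CourseID  INT REFERENCES Courses  (CourseID),
    PRIMARY KEY (StudentID, CourseID)
)
```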

What is a NOLOCK?
The NOLOCK query optimizer hint is sometimes used to improve concurrency on a busy system, at the
cost of read consistency. When the NOLOCK hint is included in a SELECT statement, no locks are
taken when data is read. The result is a Dirty Read, which means that another process could be updating the
data at the exact time you are reading it. There are no guarantees that your query will retrieve the most
recent data. The advantage to performance is that your reading of data will not block updates from taking
place, and updates will not block your reading of data. SELECT statements take Shared (Read) locks. This
means that multiple SELECT statements are allowed simultaneous access, but other processes are blocked
from modifying the data. The updates will queue until all the reads have completed, and reads requested
after the update will wait for the updates to complete. The result to your system is delay(blocking).
NOLOCK
This table hint, also known as READUNCOMMITTED, is applicable to SELECT statements only.
NOLOCK indicates that no shared locks are issued against the table that would prohibit other transactions
from modifying the data in the table.
The benefit of the statement is that it allows you to keep the database engine from issuing locks against the
tables in your queries; this increases concurrency and performance because the database engine does not
have to maintain the shared locks involved. The downside is that, because the statement does not issue any

locks against the tables being read, some "dirty," uncommitted data could potentially be read. A "dirty" read
is one in which the data being read is involved in a transaction from another connection. If that transaction
rolls back its work, the data read from the connection using NOLOCK will have read uncommitted data.
This type of read makes processing inconsistent and can lead to problems. The trick is being able to know
when you should use NOLOCK.
As a side note, NOLOCK queries also run the risk of reading "phantom" data, or data rows that are
available in one database transaction read but can be rolled back in another. (I will take a closer look at this
side effect in part two of this article series.)
The following example shows how NOLOCK works and how dirty reads can occur. In the script below, I
begin a transaction and insert a record in the SalesHistory table.
BEGIN TRANSACTION
INSERT INTO SalesHistory
(Product, SaleDate, SalePrice)
VALUES
('PoolTable', GETDATE(), 500)
The transaction is still open, which means that the record that was inserted into the table still has locks
issued against it. In a new query window, run the following script, which uses the NOLOCK table hint in
returning the number of records in the SalesHistory table.
SELECT COUNT(*) FROM SalesHistory WITH(NOLOCK)
The number of records returned is 301. Since the transaction that entered the record into the SalesHistory
table has not been committed, I can undo it. I'll roll back the transaction by issuing the following statement:
ROLLBACK TRANSACTION
This statement removes the record from the SalesHistory table that I previously inserted. Now I run the
same SELECT statement that I ran earlier:
SELECT COUNT(*) FROM SalesHistory WITH(NOLOCK)
This time the record count returned is 300. My first query read a record that was not yet committed -- this is
a dirty read.
READPAST
This is a much less commonly used table hint than NOLOCK. This hint specifies that the database engine
not consider any locked rows or data pages when returning results.
The advantage of this table hint is that, like NOLOCK, blocking does not occur when issuing queries. In
addition, dirty reads are not present in READPAST because the hint will not return locked records. The
downside of the statement is that, because records are not returned that are locked, it is very difficult to
determine if your result set, or modification statement, includes all of the necessary rows. You may need to
include some logic in your application to ensure that all of the necessary rows are eventually included.
The READPAST table hint example is very similar to the NOLOCK table hint example. I'll begin a
transaction and update one record in the SalesHistory table.

BEGIN TRANSACTION
UPDATE TOP(1) SalesHistory
SET SalePrice = SalePrice + 1
Because I do not commit or roll back the transaction, the locks that were placed on the record that I updated
are still in effect. In a new query editor window, run the following script, which uses READPAST on the
SalesHistory table to count the number of records in the table.
SELECT COUNT(*)
FROM SalesHistory WITH(READPAST)
My SalesHistory table originally had 300 records in it. The UPDATE statement is currently locking one
record in the table. The script above that uses READPAST returns 299 records, which means that because
the record I am updating is locked, it is ignored by the READPAST hint.
ROWLOCK
Using ROWLOCK politely asks SQL Server to only use row-level locks. You can use this in SELECT,
UPDATE, and DELETE statements, but I only use it in UPDATE and DELETE statements. You'd think that
an UPDATE in which you specify the primary key would always cause a row lock, but when SQL Server
gets a batch with a bunch of these, and some of them happen to be in the same page (depending on this
situation, this can be quite likely, e.g. updating all files in a folder, files which were created at pretty much
the same time), you'll see page locks, and bad things will happen. And if you don't specify a primary key
for an UPDATE or DELETE, the database has no reason to assume only a few rows will be affected, so
it probably goes right to page locks, and bad things happen.
By specifically requesting row-level locks, these problems are avoided. However, be aware that if you are
wrong and lots of rows are affected, either the database will take the initiative and escalate to page locks, or
you'll have a whole army of row locks filling your server's memory and bogging down processing. One
thing to be particularly aware of is the "Management/Current Activity" folder with Enterprise Manager. It
takes a long time to load information about a lot of locks. The information is valuable, and this technique is
very helpful, but don't be surprised if you see hundreds of locks in the "Locks/Processes" folder after
employing this technique. Just be glad you don't have lock timeouts or deadlocks.
Notes:
I get the sense that SQL Server honors NOLOCK requests religiously, but is more discretional with
ROWLOCK requests. You can only use NOLOCK in SELECT statements. This includes inner queries, and the
SELECT clause of the INSERT statement. You can and should use NOLOCK in joins:
SELECT COUNT(Users.UserID)
FROM Users WITH (NOLOCK)
JOIN UsersInUserGroups WITH (NOLOCK) ON
Users.UserID = UsersInUserGroups.UserID

What is difference between DELETE & TRUNCATE commands?
Delete command removes the rows from a table based on the condition that we provide with a WHERE
clause. Truncate will actually remove all the rows from a table and there will be no data in the table after
we run the truncate command.
TRUNCATE
TRUNCATE is faster and uses fewer system and transaction log resources than DELETE.

TRUNCATE removes the data by deallocating the data pages used to store the table’s data, and only the
page deallocations are recorded in the transaction log.
TRUNCATE removes all rows from a table, but the table structure and its columns, constraints, indexes and
so on remain. The counter used by an identity for new rows is reset to the seed for the column.
You cannot use TRUNCATE TABLE on a table referenced by a FOREIGN KEY constraint.
Because TRUNCATE TABLE does not log individual row deletions, it does not fire DELETE triggers.
TRUNCATE cannot be rolled back from the log once the transaction has committed (inside an explicit
transaction it can still be rolled back).
TRUNCATE is DDL Command.
TRUNCATE Resets identity of the table.
DELETE
DELETE removes rows one at a time and records an entry in the transaction log for each deleted row.
If you want to retain the identity counter, use DELETE instead. If you want to remove table definition and
its data, use the DROP TABLE statement.
DELETE Can be used with or without a WHERE clause
DELETE Activates Triggers.
DELETE Can be Rolled back using logs.
DELETE is DML Command.
DELETE does not reset identity of the table.
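The identity difference above can be sketched with the SalesHistory table used earlier (assuming its key is an IDENTITY column):

```sql
-- DELETE removes rows but leaves the identity counter alone:
DELETE FROM SalesHistory      -- the next insert continues the old IDENTITY sequence

-- TRUNCATE removes all rows and resets the identity to its seed:
TRUNCATE TABLE SalesHistory   -- the next insert starts from the seed again
```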

Difference between Function and Stored Procedure?
UDF can be used in the SQL statements anywhere in the WHERE/HAVING/SELECT section where as
Stored procedures cannot be.
UDFs that return tables can be treated as another rowset. This can be used in JOINs with other tables.
Inline UDFs can be thought of as views that take parameters and can be used in JOINs and other rowset
operations.
In many instances you can accomplish the same task using either a stored procedure or a function. Both
functions and stored procedures can be custom defined and part of any application. Functions, on the other
hand, are designed to send their output to a query or T-SQL statement. For example, User Defined
Functions (UDFs) run from within a SELECT statement or an action query, while Stored
Procedures (SPROCs) run via EXECUTE or EXEC. They are created using CREATE FUNCTION and
CREATE PROCEDURE respectively.
To decide between using one of the two, keep in mind the fundamental difference between them: stored
procedures are designed to return its output to the application. A UDF returns table variables, while a
SPROC can't return a table variable although it can create a table. Another significant difference between
them is that UDFs can't change the server environment or your operating system environment, while a
SPROC can. Operationally, when T-SQL encounters an error the function stops, while T-SQL will ignore an
error in a SPROC and proceed to the next statement in your code (provided you've included error handling
support). You'll also find that although a SPROC can be used in a FOR XML clause, a UDF cannot be.
If you have an operation such as a query with a FROM clause that requires a rowset be drawn from a table
or set of tables, then a function will be your appropriate choice. However, when you want to use that same
rowset in your application the better choice would be a stored procedure.
There's quite a bit of debate about the performance benefits of UDFs vs. SPROCs. You might be tempted to
believe that stored procedures add more overhead to your server than a UDF. Depending upon how you
write your code and the type of data you're processing, this might not be the case. It's always a good idea to
test important or time-consuming operations by trying both methods on your data.
User Defined Functions are compact pieces of Transact-SQL code which can accept
parameters and return either a value or a table. They are saved as individual work
units, and are created using standard SQL commands. Data transformation and
reference value retrieval are common uses for functions. LEFT, the built-in function
for getting the left part of a string, and GETDATE, used for obtaining the current date
and time, are two examples of function use. User Defined Functions enable the
developer or DBA to create functions of their own, and save them inside SQL Server.
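As a sketch of the distinction (the Employees table and its columns are hypothetical), the same lookup can be written as a scalar UDF, which composes directly into queries, or as a stored procedure, which must be executed on its own:

```sql
-- A scalar UDF: usable inline in a SELECT or WHERE clause
CREATE FUNCTION dbo.fn_FullName (@EmployeeID int)
RETURNS varchar(100)
AS
BEGIN
    DECLARE @Name varchar(100);
    SELECT @Name = FirstName + ' ' + LastName
    FROM dbo.Employees
    WHERE EmployeeID = @EmployeeID;
    RETURN @Name;
END
GO

-- Composes into a query like any built-in function:
SELECT EmployeeID, dbo.fn_FullName(EmployeeID) AS FullName
FROM dbo.Employees;

-- The equivalent stored procedure must be EXECuted separately:
CREATE PROCEDURE dbo.usp_GetFullName @EmployeeID int
AS
    SELECT FirstName + ' ' + LastName AS FullName
    FROM dbo.Employees
    WHERE EmployeeID = @EmployeeID;
GO

EXEC dbo.usp_GetFullName @EmployeeID = 1;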

When is the UPDATE_STATISTICS command used?
This command is used after a large amount of data processing has occurred. If a large number of
deletions, modifications or bulk copies into the tables has occurred, the statistics the indexes rely on must be
brought up to date to take these changes into account. UPDATE STATISTICS refreshes the statistics on these
tables accordingly.
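A minimal sketch (the table and index names are hypothetical):

```sql
-- Refresh statistics for every index on a table
UPDATE STATISTICS dbo.Orders;

-- Or target a single index
UPDATE STATISTICS dbo.Orders IX_Orders_CustomerID;
```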

What are the difference between DDL, DML and DCL commands?
DDL - Data Definition Language: statements used to define the database structure or schema. Some
examples:

CREATE - create objects in the database
ALTER - alter the structure of the database
DROP - delete objects from the database
TRUNCATE - remove all records from a table, including all space allocated for the records
COMMENT - add comments to the data dictionary
RENAME - rename an object

DML - Data Manipulation Language: statements used for managing data within schema objects. Some
examples:

SELECT - retrieve data from a database
INSERT - insert data into a table
UPDATE - update existing data within a table
DELETE - delete records from a table; the space for the records remains
MERGE - UPSERT operation (insert or update)
CALL - call a PL/SQL or Java subprogram
EXPLAIN PLAN - explain the access path to data
LOCK TABLE - control concurrency

DCL - Data Control Language. Some examples:

GRANT - give users access privileges to the database
REVOKE - withdraw access privileges given with the GRANT command

TCL - Transaction Control: statements used to manage the changes made by DML statements. It allows
statements to be grouped together into logical transactions.

COMMIT - save work done
SAVEPOINT - identify a point in a transaction to which you can later roll back
ROLLBACK - restore the database to its original state since the last COMMIT
SET TRANSACTION - change transaction options such as the isolation level and which rollback segment to use

DML statements are not auto-committed, i.e. you can roll back the operations, but DDL statements are auto-committed.
Difference between TRUNCATE, DELETE and DROP commands?
The DELETE command is used to remove some or all rows from a table. A WHERE clause can be used
to only remove some rows. If no WHERE condition is specified, all rows will be removed. After
performing a DELETE operation you need to COMMIT or ROLLBACK the transaction to make the
change permanent or to undo it. Note that this operation will cause all DELETE triggers on the table to fire.
SQL> SELECT COUNT(*) FROM emp;
  COUNT(*)
----------
        14
SQL> DELETE FROM emp WHERE job = 'CLERK';
4 rows deleted.
SQL> COMMIT;
Commit complete.
SQL> SELECT COUNT(*) FROM emp;
  COUNT(*)
----------
        10

TRUNCATE removes all rows from a table. The operation cannot be rolled back and no triggers will be
fired. As such, TRUNCATE is faster and doesn't use as much undo space as a DELETE.
SQL> TRUNCATE TABLE emp;
Table truncated.
SQL> SELECT COUNT(*) FROM emp;
  COUNT(*)
----------
         0
The DROP command removes a table from the database. All the tables' rows, indexes and privileges will
also be removed. No DML triggers will be fired. The operation cannot be rolled back.
SQL> DROP TABLE emp;
Table dropped.
SQL> SELECT * FROM emp;
SELECT * FROM emp
*
ERROR at line 1:
ORA-00942: table or view does not exist

DROP and TRUNCATE are DDL commands, whereas DELETE is a DML command. Therefore DELETE
operations can be rolled back (undone), while DROP and TRUNCATE operations cannot be rolled back.
From Oracle 10g a table can be "undropped". Example:
SQL> FLASHBACK TABLE emp TO BEFORE DROP;
Flashback complete.
PS: DELETE will not free up used space within a table. This means that repeated DELETE commands will
severely fragment the table and queries will have to navigate this "free space" in order to retrieve rows.

What types of Joins are possible with Sql Server?
Joins are used in queries to explain how different tables are related. Joins also let you select data from a
table depending upon data from another table.
Types of joins: INNER JOINs, OUTER JOINs, CROSS JOINs. OUTER JOINs are further classified as
LEFT OUTER JOINS, RIGHT OUTER JOINS and FULL OUTER JOINS.
What is the difference between a HAVING CLAUSE and a WHERE CLAUSE?
HAVING specifies a search condition for a group or an aggregate. HAVING can be used only with the
SELECT statement and is typically used with a GROUP BY clause. When GROUP BY is not used, HAVING
behaves like a WHERE clause. The HAVING clause is basically used only with the GROUP BY function in a
query. The WHERE clause is applied to each row before the rows take part in the GROUP BY function;
HAVING criteria are applied after the grouping of rows has occurred.
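A minimal sketch of the difference (the table and column names are hypothetical):

```sql
-- WHERE filters rows before grouping; HAVING filters groups after aggregation
SELECT DeptID, COUNT(*) AS Headcount
FROM dbo.Employees
WHERE Status = 'Active'          -- row-level filter, applied first
GROUP BY DeptID
HAVING COUNT(*) > 10;            -- group-level filter on the aggregate
```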
What is sub-query? Explain properties of sub-query.
Sub-queries are often referred to as sub-selects, as they allow a SELECT statement to be executed
arbitrarily within the body of another SQL statement. A sub-query is executed by enclosing it in a set of
parentheses. Sub-queries are generally used to return a single row as an atomic value, though they may be
used to compare values against multiple rows with the IN keyword.
A subquery is a SELECT statement that is nested within another T-SQL statement. A subquery SELECT
statement, if executed independently of the T-SQL statement in which it is nested, will return a result set;
that is, a subquery SELECT statement can stand alone and is not dependent on the statement in which it is
nested. A subquery SELECT statement can return any number of values, and can be found in the column
list of a SELECT statement, or in the FROM, GROUP BY, HAVING, and/or ORDER BY clauses of a T-SQL
statement. A subquery can also be used as a parameter to a function call. Basically a subquery can be used
anywhere an expression can be used.
Properties of a Sub-Query
A subquery must be enclosed in parentheses.
A subquery must be placed on the right-hand side of the comparison operator.
A subquery cannot contain an ORDER BY clause.
A query can contain more than one subquery.
What are the types of sub-queries?
Single-row subquery, where the subquery returns only one row.
Multiple-row subquery, where the subquery returns multiple rows.
Multiple-column subquery, where the subquery returns multiple columns.
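A hedged sketch of each type (the emp/dept tables follow the Oracle sample schema used earlier in this document; note the multiple-column form shown here is Oracle syntax, while SQL Server would express it with a join or EXISTS):

```sql
-- Single-row subquery: compares against one value
SELECT ename FROM emp
WHERE sal > (SELECT AVG(sal) FROM emp);

-- Multiple-row subquery: use IN (or ANY/ALL) against many values
SELECT ename FROM emp
WHERE deptno IN (SELECT deptno FROM dept WHERE loc = 'DALLAS');

-- Multiple-column subquery (Oracle syntax): compares column pairs
SELECT ename FROM emp
WHERE (deptno, job) IN (SELECT deptno, job FROM emp WHERE ename = 'SMITH');
```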

What is SQL Profiler?
SQL Profiler is a graphical tool that allows system administrators to monitor events in an instance of
Microsoft SQL Server. You can capture and save data about each event to a file or SQL Server table to
analyze later. For example, you can monitor a production environment to see which stored procedures are
hampering performance by executing too slowly.
Use SQL Profiler to monitor only the events in which you are interested. If traces are becoming too large,
you can filter them based on the information you want, so that only a subset of the event data is collected.
Monitoring too many events adds overhead to the server and the monitoring process and can cause the trace
file or trace table to grow very large, especially when the monitoring process takes place over a long period
of time.
What are User Defined Functions?
User-Defined Functions allow you to define your own T-SQL functions that can accept zero or more
parameters and return a single scalar data value or a table data type.
What kinds of User-Defined Functions can be created?
There are three types of User-Defined Functions in SQL Server 2000: Scalar, Inline Table-Valued and Multi-statement Table-Valued.
Scalar User-Defined Function
A Scalar user-defined function returns one of the scalar data types. Text, ntext, image and timestamp data
types are not supported. These are the type of user-defined functions that most developers are used to in
other programming languages. You pass in 0 to many parameters and you get a return value.
Inline Table-Value User-Defined Function
An Inline Table-Value user-defined function returns a table data type and is an exceptional alternative to a
view as the user-defined function can pass parameters into a T-SQL select command and in essence provide
us with a parameterized, non-updateable view of the underlying tables.
Multi-statement Table-Value User-Defined Function
A Multi-Statement Table-Value user-defined function returns a table and is also an exceptional alternative
to a view as the function can support multiple T-SQL statements to build the final result where the view is
limited to a single SELECT statement. Also, the ability to pass parameters into a T-SQL select command or
a group of them gives us the capability to in essence create a parameterized, non-updateable view of the
data in the underlying tables. Within the create function command you must define the table structure that
is being returned. After creating this type of user-defined function, It can be used in the FROM clause of a
T-SQL command unlike the behavior found when using a stored procedure which can also return record
sets.
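A hedged sketch of an inline table-valued function acting as a parameterized view (the Orders table and its columns are hypothetical):

```sql
CREATE FUNCTION dbo.fn_OrdersByCustomer (@CustomerID int)
RETURNS TABLE
AS
RETURN
(
    SELECT OrderID, OrderDate, TotalDue
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID
);
GO

-- Used in a FROM clause like a view, but with a parameter:
SELECT o.OrderID, o.TotalDue
FROM dbo.fn_OrdersByCustomer(42) AS o
WHERE o.OrderDate >= '2016-01-01';
```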
Which TCP/IP port does SQL Server run on? How can it be changed?
SQL Server runs on port 1433 by default. It can be changed from the Network Utility, under TCP/IP
properties -> Port number, on both the client and the server.
What are the authentication modes in SQL Server? How can it be changed?
Windows mode and mixed mode (SQL & Windows).
To change authentication mode in SQL Server click Start, Programs, Microsoft SQL Server and
click SQL Enterprise Manager to run SQL Enterprise Manager from the Microsoft SQL
Server program group. Select the server then from the Tools menu select SQL Server
Configuration Properties, and choose the Security page.

Where are SQL Server user names and passwords stored?
They are stored in the master database, in the sysxlogins table.
Which command in Query Analyzer will give you the version of SQL Server and the operating
system?

SELECT SERVERPROPERTY('productversion'), SERVERPROPERTY('productlevel')

What is SQL server agent?
SQL Server agent plays an important role in the day-to-day tasks of a database administrator (DBA). It is
often overlooked as one of the main tools for SQL Server management. Its purpose is to ease the
implementation of tasks for the DBA, with its full-function scheduling engine, which allows you to
schedule your own jobs and scripts.
SQL Server Agent is a Microsoft Windows service that executes scheduled administrative tasks, which are
called jobs. SQL Server Agent uses SQL Server to store job information. Jobs contain one or more job
steps. Each step contains its own task, for example, backing up a database. SQL Server Agent can run a job
on a schedule, in response to a specific event, or on demand. For example, if you want to back up all the
company servers every weekday after hours, you can automate this task. Schedule the backup to run after
22:00 Monday through Friday; if the backup encounters a problem, SQL Server Agent can record the event
and notify you.
The SQL Server Agent is a service that lets you configure scheduled tasks and system alerts. SQL Server
Agent runs continuously in the background as a Windows Service.
The SQL Server Agent is made up of the following components:

Jobs - SQL Agent jobs consist of one or more steps to be executed. Each step consists of a SQL statement.
Jobs can be scheduled to run at specified times or at specified intervals.

Alerts - Alerts consist of a set of actions to occur when a specific event occurs (such as when a particular
error occurs, or the database reaches a defined size). Alerts can include sending an email to the
administrator, paging the administrator, or running a job to fix the problem.

Operators - Operators are people who can address problems with SQL Server. Operators can be identified
through their network account or their email identifier. These are usually the people alerts are sent to.

Can a stored procedure call itself (a recursive stored procedure)? How many levels of SP nesting are
possible?
Yes. Because Transact-SQL supports recursion, you can write stored procedures that call themselves.
Recursion can be defined as a method of problem solving wherein the solution is arrived at by repetitively
applying it to subsets of the problem. A common application of recursive logic is to perform numeric
computations that lend themselves to repetitive evaluation by the same processing steps. Stored procedures
are nested when one stored procedure calls another or executes managed code by referencing a CLR
routine, type, or aggregate. You can nest stored procedures and managed code references up to 32 levels.
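A hedged sketch of a recursive stored procedure (the classic factorial example; the procedure name is illustrative):

```sql
CREATE PROCEDURE dbo.usp_Factorial
    @n int,
    @result bigint OUTPUT
AS
BEGIN
    IF @n <= 1
        SET @result = 1;
    ELSE
    BEGIN
        -- EXEC parameters must be variables, so compute @n - 1 first
        DECLARE @prev int, @sub bigint;
        SET @prev = @n - 1;
        EXEC dbo.usp_Factorial @prev, @sub OUTPUT;  -- recursive call
        SET @result = @n * @sub;
    END
END
GO

DECLARE @f bigint;
EXEC dbo.usp_Factorial 5, @f OUTPUT;
SELECT @f;   -- 120
```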

What is @@ERROR?
The @@ERROR automatic variable returns the error code of the last Transact-SQL statement. If there was
no error, @@ERROR returns zero. Because @@ERROR is reset after each Transact-SQL statement, it
must be saved to a variable if it needs to be processed further after checking it.
What is RAISERROR?
Stored procedures report errors to client applications via the RAISERROR command. RAISERROR
doesn’t change the flow of a procedure; it merely displays an error message, sets the @@ERROR
automatic variable, and optionally writes the message to the SQL Server error log and the NT application
event log.
What is log shipping?
Log shipping is the process of automating the backup of database and transaction log
files on a production SQL Server, and then restoring them onto a standby server.
Only Enterprise Edition supports log shipping. In log shipping the transaction log
file from one server is automatically applied to the backup database on the other
server. If one server fails, the other server will have the same database and can be
used as the disaster recovery plan. The key feature of log shipping is that it will
automatically back up transaction logs throughout the day and automatically restore
them on the standby server at a defined interval.
What is the difference between a local and a global temporary table?
A local temporary table exists only for the duration of a connection or, if defined inside a compound
statement, for the duration of the compound statement.
A global temporary table remains in the database permanently, but the rows exist only within a given
connection. When the connection is closed, the data in the global temporary table disappears. However, the
table definition remains with the database for access when the database is opened next time.
What command do we use to rename a database?
sp_renamedb 'oldname', 'newname'
If someone is using the database it will not accept sp_renamedb. In that case first bring the database to
single-user mode using sp_dboption, use sp_renamedb to rename the database, then use sp_dboption to
bring the database back to multi-user mode.
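The sequence above as a sketch (the database names are hypothetical):

```sql
-- Restrict the database to a single user, rename it, then reopen it
EXEC sp_dboption 'Sales', 'single user', 'true';
EXEC sp_renamedb 'Sales', 'SalesArchive';
EXEC sp_dboption 'SalesArchive', 'single user', 'false';
```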
What is sp_configure commands and set commands?
Use sp_configure to display or change server-level settings. To change database-level settings, use ALTER
DATABASE. To change settings that affect only the current user session, use the SET statement.

What is replication?
Replication is the process of sharing data between databases in different locations. Using replication, you
create copies of the Database and share the copy with different users so that they can make changes to
their local copy of the database and later synchronize the changes to the source database.
What are the different types of replication? Explain.
The SQL Server 2000-supported replication types are as follows:

Transactional
Snapshot
Merge

Snapshot replication distributes data exactly as it appears at a specific moment in time and does not
monitor for updates to the data. Snapshot replication is best used as a method for replicating data that
changes infrequently or where the most up-to-date values (low latency) are not a requirement. When
synchronization occurs, the entire snapshot is generated and sent to Subscribers.
In transactional replication, an initial snapshot of data is applied at Subscribers; then, when data
modifications are made at the Publisher, the individual transactions are captured and propagated to
Subscribers.
Merge replication is the process of distributing data from Publisher to Subscribers, allowing the Publisher
and Subscribers to make updates while connected or disconnected, and then merging the updates between
sites when they are connected.
What are the OS services that the SQL Server installation adds?
MS SQL Server Service, SQL Agent Service, DTC (Distributed Transaction Coordinator)
What are three SQL keywords used to change or set someone’s permissions?
GRANT, DENY, and REVOKE.
What does it mean to have quoted_identifier on? What are the implications of having it off?
When SET QUOTED_IDENTIFIER is ON, identifiers can be delimited by double
quotation marks, and literals must be delimited by single quotation marks. When SET
QUOTED_IDENTIFIER is OFF, identifiers cannot be quoted and must follow all
Transact-SQL rules for identifiers.
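A brief illustration of the setting's effect (the identifiers here are deliberately reserved words to show why quoting matters):

```sql
SET QUOTED_IDENTIFIER ON;
-- Double quotes delimit identifiers, so reserved words can be used as names
CREATE TABLE "select" ("order" int);
SELECT "order" FROM "select";

SET QUOTED_IDENTIFIER OFF;
-- Now double quotes delimit character literals instead of identifiers
SELECT "just a string literal";
```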
What is the STUFF function and how does it differ from the REPLACE function?
The STUFF function is used to overwrite existing characters. Using the syntax STUFF(string_expression,
start, length, replacement_characters): string_expression is the string that will have characters substituted,
start is the starting position, length is the number of characters in the string that are substituted, and
replacement_characters are the new characters interjected into the string.
The REPLACE function is used to replace all occurrences of existing characters. Using the syntax
REPLACE(string_expression, search_string, replacement_string), every incidence of search_string
found in the string_expression will be replaced with replacement_string.
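A quick illustration of the difference (mentally traced: STUFF deletes 'bcd' and inserts 'xyz'; REPLACE substitutes every 'b'):

```sql
-- STUFF works by position: delete 3 characters starting at position 2,
-- then insert 'xyz' there
SELECT STUFF('abcdef', 2, 3, 'xyz');      -- 'axyzef'

-- REPLACE works by pattern: every occurrence is substituted
SELECT REPLACE('abcabc', 'b', 'x');       -- 'axcaxc'
```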
Using Query Analyzer, name 3 ways to get an accurate count of the number of records in a table.

SELECT * FROM table1

SELECT COUNT(*) FROM table1

SELECT rows FROM sysindexes
WHERE id = OBJECT_ID('table1') AND indid < 2
How to rebuild Master Database?
Shutdown Microsoft SQL Server 2000, and then run Rebuildm.exe. This is located in the Program
Files\Microsoft SQL Server\80\Tools\Binn directory.
In the Rebuild Master dialog box, click Browse.
In the Browse for Folder dialog box, select the \Data folder on the SQL Server 2000 compact disc or in the
shared network directory from which SQL Server 2000 was installed, and then click OK.
Click Settings. In the Collation Settings dialog box, verify or change settings used for the master database
and all other databases.
Initially, the default collation settings are shown, but these may not match the collation selected during
setup. You can select the same settings used during setup or select new collation settings. When done, click
OK.
In the Rebuild Master dialog box, click Rebuild to start the process.
The Rebuild Master utility reinstalls the master database.
To continue, you may need to stop a server that is running.
Source: http://msdn2.microsoft.com/en-us/library/aa197950(SQL.80).aspx
What are the basic functions of the master, msdb, model and tempdb databases?
The Master database holds information for all databases located on the SQL Server instance and is the glue
that holds the engine together. Because SQL Server cannot start without a functioning master database, you
must administer this database with care.
The msdb database stores information regarding database backups, SQL Agent information, DTS packages,
SQL Server jobs, and some replication information such as for log shipping.
The tempdb holds temporary objects such as global and local temporary tables and stored procedures.
The model is essentially a template database used in the creation of any new user database created in the
instance.
What are primary keys and foreign keys?
Primary keys are the unique identifiers for each row. They must contain unique values and cannot be null.
Due to their importance in relational databases, Primary keys are the most fundamental of all keys and
constraints. A table can have only one Primary key.
Foreign keys are both a method of ensuring data integrity and a manifestation of the relationship between
tables.
What is data integrity? Explain constraints?
Data integrity is an important feature in SQL Server. When used properly, it ensures that data is accurate,
correct, and valid. It also acts as a trap for otherwise undetectable bugs within applications.
A PRIMARY KEY constraint is a unique identifier for a row within a database table.
Every table should have a primary key constraint to uniquely identify each row and
only one primary key constraint can be created for each table. The primary key
constraints are used to enforce entity integrity.
A UNIQUE constraint enforces the uniqueness of the values in a set of columns, so
no duplicate values are entered. The unique key constraints are used to enforce
entity integrity as the primary key constraints.
A FOREIGN KEY constraint prevents any actions that would destroy links between
tables with the corresponding data values. A foreign key in one table points to a
primary key in another table. Foreign keys prevent actions that would leave rows
with foreign key values when there are no primary keys with that value. The foreign
key constraints are used to enforce referential integrity.
A CHECK constraint is used to limit the values that can be placed in a column. The
check constraints are used to enforce domain integrity.
A NOT NULL constraint enforces that the column will not accept null values. The not
null constraints are used to enforce domain integrity, as the check constraints.
What are the properties of relational tables?
Relational tables have six properties:

Values are atomic.
Column values are of the same kind.
Each row is unique.
The sequence of columns is insignificant.
The sequence of rows is insignificant.
Each column must have a unique name.

What is De-normalization?
De-normalization is the process of attempting to optimize the performance of a database by adding
redundant data. It is sometimes necessary because current DBMSs implement the relational model poorly.
A true relational DBMS would allow for a fully normalized database at the logical level, while providing
physical storage of data that is tuned for high performance. De-normalization is a technique to move from
higher to lower normal forms of database modeling in order to speed up database access.
How to get @@ERROR and @@ROWCOUNT at the same time?
If @@ROWCOUNT is checked after an error-checking statement, it will have a value of 0, as it will have
been reset. And if @@ROWCOUNT is checked before the error-checking statement, then @@ERROR will
be reset. To get @@ERROR and @@ROWCOUNT at the same time, capture both in the same statement and
store them in local variables:
SELECT @RC = @@ROWCOUNT, @ER = @@ERROR
What is Identity?
Identity (or AutoNumber) is a column that automatically generates numeric values. A start and increment
value can be set, but most DBAs leave these at 1. A GUID column also generates unique values, but the
value cannot be controlled. Identity/GUID columns do not need to be indexed.
What is a Scheduled Job or a Scheduled Task?
Scheduled tasks let users automate processes that run on regular or predictable cycles. Users can schedule
administrative tasks, such as cube processing, to run during times of slow business activity. Users can also
determine the order in which tasks run by creating job steps within a SQL Server Agent job, e.g. back up the
database, then update statistics of tables. Job steps give users control over the flow of execution. If one step
fails, the user can configure SQL Server Agent to continue to run the remaining tasks or to stop execution.
What is a table called if it has neither a clustered nor a non-clustered index? What is it used for?
An unindexed table, or heap. Microsoft Press books and Books Online (BOL) refer to it as a heap.
A heap is a table that does not have a clustered index and, therefore, the pages are not linked by pointers.
The IAM pages are the only structures that link the pages in a table together.
Unindexed tables are good for fast storing of data. Many times it is better to drop all indexes from a table,
do the bulk inserts, and then restore those indexes.
What is BCP? When is it used?
BCP (Bulk Copy Program) is a tool used to copy huge amounts of data from tables and views. BCP does
not copy the structures from source to destination.
How do you load large data into the SQL Server database?
BCP is a tool used to copy huge amounts of data into tables. The BULK INSERT command imports a data
file into a database table or view in a user-specified format.
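A hedged sketch of BULK INSERT (the file path and the Customers table are hypothetical):

```sql
-- Load a comma-delimited file into an existing table, skipping the header row
BULK INSERT dbo.Customers
FROM 'C:\data\customers.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', FIRSTROW = 2);
```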
Can we rewrite subqueries into simple select statements or with joins?
Subqueries can often be rewritten to use a standard outer join, resulting in faster performance. An outer
join returns all non-matching rows with NULL values (in Oracle's older syntax this is written with the plus
sign (+) operator). Hence we combine the outer join with a NULL test in the WHERE clause to reproduce the
result set without using a subquery.

Can SQL Server be linked to other servers like Oracle?
SQL Server can be linked to any server provided there is an OLE DB provider from Microsoft to allow the
link. E.g. Microsoft provides an OLE DB provider for Oracle that can be used to add Oracle as a linked
server to a SQL Server group.
How to know which indexes a table has?
In SQL Server: EXEC sp_helpindex 'table_name'
(In Oracle, the equivalent information comes from views such as user_indexes and user_constraints.)
How to copy tables, schemas and views from one SQL Server to another?
Microsoft SQL Server 2000 Data Transformation Services (DTS) is a set of graphical tools and
programmable objects that lets users extract, transform, and consolidate data from disparate sources into
single or multiple destinations.
What is a join? List the different types of joins.
Joins are used in queries to explain how different tables are related. Joins also let you select data from a
table depending upon data from another table. Types of joins: INNER JOINs, OUTER JOINs, CROSS
JOINs. OUTER JOINs are further classified as LEFT OUTER JOINS, RIGHT OUTER JOINS and FULL
OUTER JOINS.
1. Inner Join: the inner join is the default type of join; it produces a result set
containing matched rows only.
Syntax: SELECT * FROM table1 INNER JOIN table2 ON table1.key = table2.key

2. Outer Join: an outer join produces a result set containing matched rows and
unmatched rows. There are three types: Left Outer Join, Right Outer Join and Full Outer Join.

Left Outer Join: produces a result set containing all the rows from the left table and
the matched rows from the right table.
Syntax: SELECT * FROM table1 LEFT OUTER JOIN table2 ON table1.key = table2.key

Right Outer Join: produces a result set containing all the rows from the right table and
the matched rows from the left table.
Syntax: SELECT * FROM table1 RIGHT OUTER JOIN table2 ON table1.key = table2.key

Full Outer Join: produces a result set containing all the rows from the left table and
all the rows from the right table.
Syntax: SELECT * FROM table1 FULL OUTER JOIN table2 ON table1.key = table2.key

3. Cross Join: a join without any condition is known as a cross join; in a cross join
every row in the first table is joined with every row in the second table.
Syntax: SELECT * FROM table1 CROSS JOIN table2

Self Join: a table joined with itself is called a self join; when working with self joins
we use alias tables.
What is Self Join?
This is a particular case in which a table joins to itself, with one or two aliases to avoid confusion. A self
join can be of any type, as long as the joined tables are the same. A self join is rather unique in that it
involves a relationship with only one table. The common example is when a company has a hierarchical
reporting structure whereby one member of staff reports to another.
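A sketch of that classic case (a hypothetical Employees table with a ManagerID column referencing EmployeeID):

```sql
SELECT e.Name AS Employee, m.Name AS Manager
FROM dbo.Employees AS e
LEFT OUTER JOIN dbo.Employees AS m   -- same table under a second alias
    ON e.ManagerID = m.EmployeeID;
```

The LEFT OUTER JOIN keeps staff with no manager (e.g. the CEO) in the result, with Manager reported as NULL.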
What is Cross Join?
A cross join that does not have a WHERE clause produces the Cartesian product of the tables involved in
the join. The size of a Cartesian product result set is the number of rows in the first table multiplied by the
number of rows in the second table. The common example is when company wants to combine each
product with a pricing table to analyze each product at each price.
Which virtual table does a trigger use?
Inserted and Deleted.
List a few advantages of Stored Procedures.

Stored procedures can reduce network traffic and latency, boosting application performance.
Stored procedure execution plans can be reused, staying cached in SQL Server's memory, reducing server overhead.
Stored procedures help promote code reuse.
Stored procedures can encapsulate logic. You can change stored procedure code without affecting clients.
Stored procedures provide better security for your data.

What is Data Warehousing?
A data warehouse is a collection of data that is:

Subject-oriented, meaning that the data in the database is organized so that all the data elements
relating to the same real-world event or object are linked together;
Time-variant, meaning that the changes to the data in the database are tracked and recorded so
that reports can be produced showing changes over time;
Non-volatile, meaning that data in the database is never over-written or deleted; once committed,
the data is static, read-only, and retained for future reporting;
Integrated, meaning that the database contains data from most or all of an organization's
operational applications, and that this data is made consistent.

What is OLTP (OnLine Transaction Processing)?
OLTP (online transaction processing) systems use relational database design following the discipline of data
modeling and generally the Codd rules of data normalization in order to ensure absolute data
integrity. Using these rules, complex information is broken down into its simplest structures (tables)
where all of the individual atomic-level elements relate to each other and satisfy the normalization rules.
How do SQL server 2000 and XML linked? Can XML be used to access data?
FOR XML (RAW, AUTO, EXPLICIT)
You can execute SQL queries against existing relational databases to return results as XML rather than
standard rowsets. These queries can be executed directly or from within stored procedures. To retrieve
XML results, use the FOR XML clause of the SELECT statement and specify an XML mode of RAW,
AUTO, or EXPLICIT.
OPENXML
OPENXML is a Transact-SQL keyword that provides a relational/rowset view over an in-memory XML
document. OPENXML is a rowset provider similar to a table or a view. OPENXML provides a way to
access XML data within the Transact-SQL context by transferring data from an XML document into the
relational tables. Thus, OPENXML allows you to manage an XML document and its interaction with the
relational environment.
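A brief sketch of both directions (the Customers table and the sample XML are hypothetical):

```sql
-- Relational rows out as XML
SELECT CustomerID, CompanyName
FROM dbo.Customers
FOR XML AUTO;

-- XML in as a rowset via OPENXML
DECLARE @doc varchar(1000), @hdoc int;
SET @doc = '<ROOT><Customer CustomerID="1" CompanyName="Acme"/></ROOT>';
EXEC sp_xml_preparedocument @hdoc OUTPUT, @doc;
SELECT *
FROM OPENXML(@hdoc, '/ROOT/Customer', 1)   -- flag 1 = attribute-centric mapping
WITH (CustomerID int, CompanyName varchar(40));
EXEC sp_xml_removedocument @hdoc;
```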

What is an execution plan? When would you use it? How would you view the execution plan?
An execution plan is basically a road map that graphically or textually shows the data retrieval methods
chosen by the SQL Server query optimizer for a stored procedure or ad-hoc query and is a very useful tool
for a developer to understand the performance characteristics of a query or stored procedure since the plan
is the one that SQL Server will place in its cache and use to execute the stored procedure or query. From
within Query Analyzer there is an option called "Show Execution Plan" (located on the Query drop-down
menu). If this option is turned on, it will display the query execution plan in a separate window when the
query is run.

SQL Transaction?
CREATE PROCEDURE DeleteDepartment
(
    @DepartmentID int
)
AS
-- This sproc performs two DELETEs. First it deletes all of the
-- department's associated employees. Next, it deletes the department.

-- STEP 1: Start the transaction
BEGIN TRANSACTION

-- STEP 2 & 3: Issue the DELETE statements, checking @@ERROR after each statement
DELETE FROM Employees
WHERE DepartmentID = @DepartmentID

-- Rollback the transaction if there were any errors
IF @@ERROR <> 0
BEGIN
    -- Rollback the transaction
    ROLLBACK

    -- Raise an error and return
    RAISERROR ('Error in deleting employees in DeleteDepartment.', 16, 1)
    RETURN
END

DELETE FROM Departments
WHERE DepartmentID = @DepartmentID

-- Rollback the transaction if there were any errors
IF @@ERROR <> 0
BEGIN
    -- Rollback the transaction
    ROLLBACK

    -- Raise an error and return
    RAISERROR ('Error in deleting department in DeleteDepartment.', 16, 1)
    RETURN
END

-- STEP 4: If we reach this point, the commands completed successfully.
-- Commit the transaction.
COMMIT
SQL Transaction in .net?
using (SqlConnection connection = new SqlConnection(connectionString))
{
    SqlCommand command = connection.CreateCommand();
    SqlTransaction transaction = null;
    try
    {
        // BeginTransaction() requires an open connection
        connection.Open();
        transaction = connection.BeginTransaction();

        // Assign the transaction to the command
        command.Transaction = transaction;

        // Execute 1st command
        command.CommandText = "Insert ...";
        command.ExecuteNonQuery();

        // Execute 2nd command
        command.CommandText = "Update...";
        command.ExecuteNonQuery();

        transaction.Commit();
    }
    catch
    {
        // transaction is still null if the failure occurred before
        // BeginTransaction(), e.g. while opening the connection
        if (transaction != null)
            transaction.Rollback();
        throw;
    }
    finally
    {
        connection.Close();
    }
}

How to count number of tables?
select count(*) from sysobjects where xtype='U'
will return the number of tables (not including system tables)
in the database you are currently connected to.

How do you find the Second highest Salary?
SELECT MAX(SALARY) FROM EMPLOYEE
WHERE SALARY NOT IN (SELECT MAX(SALARY) FROM EMPLOYEE)
The sub-query in the WHERE clause returns the MAX SALARY in the table; the main query then SELECTs
the MAX SALARY from the remaining rows, i.e. from the rows that do not have the highest SALARY.
Alternative approaches using TOP:
SELECT TOP 1 * FROM (SELECT TOP 2 * FROM users ORDER BY userid DESC) a
ORDER BY userid ASC
SELECT TOP 1 stud_name, stud_marks
FROM (SELECT TOP 2 stud_name, stud_marks FROM student ORDER BY stud_marks DESC) a
ORDER BY stud_marks ASC
SELECT TOP 1 salary FROM emp WHERE salary
NOT IN (SELECT TOP 1 salary FROM emp ORDER BY salary DESC) ORDER BY salary DESC
Write a SQL Query to find first Week Day of month?
SELECT DATENAME(dw, DATEADD(dd, - DATEPART(dd, GETDATE()) + 1, GETDATE())) AS
FirstDay
How to find 6th highest salary from Employee table
SELECT TOP 1 salary FROM (SELECT DISTINCT TOP 6 salary FROM employee ORDER BY
salary DESC) a ORDER BY salary
How can I enforce to use particular index?
You can use index hint (index=index_name) after the table name. SELECT au_lname FROM authors
(index=aunmind)
What is sorting and what is the difference between sorting and clustered indexes?
The ORDER BY clause sorts query results by one or more columns (up to 8,060 bytes). This sorting
happens at the time data is retrieved from the database. A clustered index, by contrast, physically
sorts the data as rows are inserted into or updated in the table.
What are the differences between UNION and JOINS?
A join selects columns from 2 or more tables. A union selects rows.
What is the Referential Integrity?
Referential integrity refers to the consistency that must be maintained between primary and foreign keys,
i.e. every foreign key value must have a corresponding primary key value.
What is the row size in SQL Server 2000?
8060 bytes.
What is the use of SCOPE_IDENTITY() function?
Returns the most recently created identity value for the tables in the current execution scope.
What are the different ways of moving data/databases between servers and databases in SQL Server?
There are lots of options available, you have to choose your option depending upon your requirements.
Some of the options you have are: BACKUP/RESTORE, detaching and attaching databases, replication,
DTS, BCP, logshipping, INSERT...SELECT, SELECT...INTO, creating INSERT scripts to generate data.

How do you transfer data from text file to database (other than DTS)?
Using the BCP (Bulk Copy Program) utility.
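For example, a bcp invocation to import a comma-delimited text file might look like this (the server, database, table, and file names are hypothetical):

```
bcp MyDB.dbo.MyTable in C:\data\mytable.txt -c -t, -S MyServer -T
```

Here "in" means import, -c uses character mode, -t, sets a comma as the field terminator, -S names the server, and -T uses a trusted (Windows authentication) connection. The same utility exports with "out" in place of "in".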
What is a deadlock?
Deadlock is a situation when two processes, each having a lock on one piece of data, attempt to acquire a
lock on the other's piece. Each process would wait indefinitely for the other to release the lock, unless one
of the user processes is terminated. SQL Server detects deadlocks and terminates one user's process.

What is a LiveLock?
A livelock is one, where a request for an exclusive lock is repeatedly denied because a series of overlapping
shared locks keeps interfering. SQL Server detects the situation after four denials and refuses further shared
locks. A livelock also occurs when read transactions monopolize a table or page, forcing a write transaction
to wait indefinitely.
How to restart SQL Server in single user mode?
From Startup Options :- Go to SQL Server Properties by right-clicking on the Server name in the Enterprise
manager. Under the 'General' tab, click on 'Startup Parameters'. Enter a value of -m in the Parameter.
Does SQL Server 2000 clustering support load balancing?
SQL Server 2000 clustering does not provide load balancing; it provides failover support. To achieve load
balancing, you need software that balances the load between clusters, not between servers within a cluster.
What is DTC?
The Microsoft Distributed Transaction Coordinator (MS DTC) is a transaction manager that allows client
applications to include several different sources of data in one transaction. MS DTC coordinates
committing the distributed transaction across all the servers enlisted in the transaction.
What is DTS?
Microsoft SQL Server 2000 Data Transformation Services (DTS) is a set of graphical tools and
programmable objects that lets you extract, transform, and consolidate data from disparate sources into
single or multiple destinations.
What are defaults? Is there a column to which a default can't be bound?
A default is a value that will be used by a column, if no value is supplied to that column while inserting
data. IDENTITY columns and timestamp columns can't have defaults bound to them.
What are the constraints ?
Table Constraints define rules regarding the values allowed in columns and are the standard mechanism for
enforcing integrity. SQL Server 2000 supports five classes of constraints. NOT NULL , CHECK, UNIQUE,
PRIMARY KEY, FOREIGN KEY.
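All five constraint classes can appear in a single table definition; a sketch with hypothetical table and column names:

```sql
CREATE TABLE Employees
(
    EmployeeID   int          NOT NULL PRIMARY KEY,   -- PRIMARY KEY (unique, not null)
    Email        varchar(100) NOT NULL UNIQUE,        -- NOT NULL + UNIQUE
    Salary       money        CHECK (Salary >= 0),    -- CHECK rule on allowed values
    DepartmentID int          REFERENCES Departments(DepartmentID)  -- FOREIGN KEY
)
```

Any INSERT or UPDATE that violates one of these rules is rejected by the server, which is what makes constraints the standard mechanism for enforcing integrity.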
What is Transaction?
A transaction is a sequence of operations performed as a single logical unit of work. A logical unit of work
must exhibit four properties, called the ACID (Atomicity, Consistency, Isolation, and Durability) properties,
to qualify as a transaction.
What is Isolation Level?
An isolation level determines the degree of isolation of data between concurrent transactions. The default
SQL Server isolation level is Read Committed. A lower isolation level increases concurrency, but at the
expense of data correctness. Conversely, a higher isolation level ensures that data is correct, but can affect
concurrency negatively. The isolation level required by an application determines the locking behavior SQL
Server uses. SQL-92 defines the following isolation levels, all of which are supported by SQL Server:
Read uncommitted (the lowest level where transactions are isolated only enough to ensure that physically

corrupt data is not read).
Read committed (SQL Server default level).
Repeatable read.
Serializable (the highest level, where transactions are completely isolated from one another).
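The isolation level is set per connection before the transaction begins; for example (table and column names hypothetical):

```sql
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRANSACTION
    -- Rows read here stay locked so other transactions cannot modify them
    -- until this transaction completes
    SELECT * FROM Orders WHERE CustomerID = 42
COMMIT
```

Dropping the level to READ UNCOMMITTED would remove those shared locks (allowing dirty reads), while raising it to SERIALIZABLE would additionally prevent other transactions from inserting rows that match the WHERE clause.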
What is denormalization and when would you go for it?
As the name indicates, denormalization is the reverse process of normalization. It's the controlled
introduction of redundancy in to the database design. It helps improve the query performance as the number
of joins could be reduced.
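As an illustrative sketch (table and column names hypothetical): instead of joining Orders to Customers on every report, the customer name can be stored redundantly on the order row:

```sql
-- Normalized: every report pays for a join
SELECT o.OrderID, c.CustomerName
FROM Orders o JOIN Customers c ON o.CustomerID = c.CustomerID

-- Denormalized: CustomerName is copied into Orders, trading controlled
-- redundancy (and the cost of keeping the copy in sync) for join-free reads
SELECT OrderID, CustomerName FROM Orders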
What are user defined datatypes and when you should go for them?
User defined datatypes let you extend the base SQL Server datatypes by providing a descriptive name, and
format to the database. Take for example, in your database, there is a column called Flight_Num which
appears in many tables. In all these tables it should be varchar(8). In this case you could create a user
defined datatype called Flight_num_type of varchar(8) and use it across all your tables.
See sp_addtype, sp_droptype in books online.
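Continuing the Flight_Num example from above (the second column in each table is made up for illustration):

```sql
-- Create the user defined datatype once...
EXEC sp_addtype Flight_num_type, 'varchar(8)', 'NOT NULL'

-- ...then use it consistently across all tables that carry a flight number
CREATE TABLE Flights  (Flight_Num Flight_num_type, Origin varchar(3))
CREATE TABLE Bookings (BookingID int, Flight_Num Flight_num_type)
```

If the format ever needs to change, there is a single definition to update rather than a varchar(8) scattered across many table definitions.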
What is bit datatype and what's the information that can be stored inside a bit column?
Bit datatype is used to store boolean information like 1 or 0 (true or false). Until SQL Server 6.5 the bit
datatype could hold either a 1 or 0 and there was no support for NULL. From SQL Server 7.0 onwards, the
bit datatype can represent a third state, which is NULL.
Define candidate key, alternate key, composite key.
A candidate key is one that can identify each row of a table uniquely. Generally a candidate key becomes
the primary key of the table. If the table has more than one candidate key, one of them will become the
primary key, and the rest are called alternate keys.
A key formed by combining at least two or more columns is called composite key.
CREATE INDEX myIndex ON myTable(myColumn)
What type of Index will get created after executing the above statement?
Non-clustered index. Important thing to note: By default a clustered index gets created on the primary key,
unless specified otherwise.
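To see both behaviors side by side (table and column names hypothetical):

```sql
-- The PRIMARY KEY gets a clustered index by default
CREATE TABLE myTable (myKey int PRIMARY KEY, myColumn int)

-- CREATE INDEX defaults to NONCLUSTERED
CREATE INDEX myIndex ON myTable(myColumn)

-- A clustered index must be requested explicitly, and a table can have
-- only one (here the PRIMARY KEY already claimed it)
-- CREATE CLUSTERED INDEX myClustered ON myTable(myColumn)
```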
What is lock escalation?
Lock escalation is the process of converting many low-level locks (like row locks, page locks) into higher-
level locks (like table locks). Every lock is a memory structure, so too many locks would mean more memory
being occupied by locks. To prevent this from happening, SQL Server escalates the many fine-grain locks
to fewer coarse-grain locks. The lock escalation threshold was definable in SQL Server 6.5, but from SQL
Server 7.0 onwards it's dynamically managed by SQL Server.

What is RAID and what are different types of RAID configurations?
RAID stands for Redundant Array of Inexpensive Disks, used to provide fault tolerance to database servers.
There are six RAID levels 0 through 5 offering different levels of performance, fault tolerance. MSDN has
some information about RAID levels and for detailed information, check out the RAID advisory board's
homepage.

Why is a UNION ALL faster than a UNION?
UNION ALL is faster than UNION because for a UNION operation the server needs to remove the duplicate
values, while for UNION ALL it does not. That is why UNION ALL is faster than UNION. It is recommended
that if you know the union set operation can never return duplicate values, you use UNION ALL
instead of UNION.
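A quick way to see the difference:

```sql
SELECT 1 AS n UNION     SELECT 1   -- one row: duplicates are removed
SELECT 1 AS n UNION ALL SELECT 1   -- two rows: no duplicate check is performed
```

The duplicate removal in the first query requires a sort or hash over the combined result set, which is the work UNION ALL skips.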
How many types of data models are there?
There are no standards in this area. Authors and theorists make it up as they go. The entity-relationship
model (ER) has hundreds of derivatives (Bachman, Chen, IBM, IDEF1X, etc.). The most popular of the OO
models is the Unified Modeling Language (UML). Actually, UML and IDEF1X are the closest to becoming
standards that can support software products. Rational already has products, and IDEF1X is the language of
ERwin.
Don't be fooled by these variations. They all represent the same things, but you have to be very careful that
you understand all of the non-standard symbols or you will surely make mistakes in interpreting what the
pictures mean.
How do you troubleshoot SQL Server if its running very slow?
First, check processor and memory usage to verify that the processor is not above 80% utilization and
memory is not above 40-45% utilization, then check the disk utilization using Performance Monitor.
Second, use SQL Profiler to check the users and the current SQL activities and jobs running, which might
be the problem. Third, run the UPDATE STATISTICS command to refresh the index statistics.
Let us say the SQL Server crashed and you are rebuilding the databases including the master
database what procedure to you follow?
For restoring the master db we have to stop SQL Server first, and then from the command line we can run
sqlservr -m, which will bring the server up in single-user (maintenance) mode, after which we can restore
the master db.
Explain Active/Active and Active/Passive cluster configurations
Hopefully you have experience setting up cluster servers. But if you don't, at least be familiar with the way
clustering works and the two clustering configurations Active/Active and Active/Passive. SQL Server
books online has enough information on this topic and there is a good white paper available on Microsoft
site.
Explain CREATE DATABASE syntax
Many of us are used to creating databases from the Enterprise Manager or by just issuing the command:
CREATE DATABASE MyDB. But what if you have to create a database with two filegroups, one on drive C
and the other on drive D with log on drive E with an initial size of 600 MB and with a growth factor of
15%? That's why being a DBA you should be familiar with the CREATE DATABASE syntax. Check out
SQL Server books online for more information.
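A sketch of the database described above (the logical and physical file names, and the log file's initial size, are assumptions for illustration):

```sql
CREATE DATABASE MyDB
ON PRIMARY
    (NAME = MyDB_Data1, FILENAME = 'C:\Data\MyDB_Data1.mdf',
     SIZE = 600MB, FILEGROWTH = 15%),
FILEGROUP SecondFG
    (NAME = MyDB_Data2, FILENAME = 'D:\Data\MyDB_Data2.ndf',
     SIZE = 600MB, FILEGROWTH = 15%)
LOG ON
    (NAME = MyDB_Log, FILENAME = 'E:\Logs\MyDB_Log.ldf',
     SIZE = 100MB, FILEGROWTH = 15%)
```

Each file entry takes its own NAME, FILENAME, SIZE, and FILEGROWTH, which is what makes the full syntax worth knowing beyond the one-line form.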

Write down the general syntax for a SELECT statements covering all the options
Here's the basic syntax: (Also checkout SELECT in books online for advanced syntax).
SELECT select_list
[INTO new_table]
FROM table_source
[WHERE search_condition]
[GROUP BY group_by_expression]
[HAVING search_condition]
[ORDER BY order_expression [ASC | DESC] ]

Can you have a nested transaction?
Yes, very much. Check out BEGIN TRAN, COMMIT, ROLLBACK, SAVE TRAN and
@@TRANCOUNT
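@@TRANCOUNT shows the nesting depth as transactions open and close:

```sql
BEGIN TRAN                 -- @@TRANCOUNT = 1
    BEGIN TRAN             -- @@TRANCOUNT = 2
    COMMIT                 -- @@TRANCOUNT = 1 (inner COMMIT only decrements)
COMMIT                     -- @@TRANCOUNT = 0 (outermost COMMIT really commits)
-- Note: a ROLLBACK at any depth rolls back ALL levels and resets
-- @@TRANCOUNT to 0; use SAVE TRAN to roll back to a named savepoint instead
```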

What is the system function to get the current user's user id?
USER_ID().Also check out other system functions like USER_NAME(), SYSTEM_USER,
SESSION_USER, CURRENT_USER, USER, SUSER_SID(), HOST_NAME().
What are the types of backup and tell me the difference between full and differential backup?
Full Backups include all data within the backup scope. For
example, a full database backup will include all data in the
database, regardless of when it was last created or
modified. Similarly, a full partial backup will include the
entire contents of every file and filegroup within the scope
of that partial backup.
Differential Backups include only that portion of the data
that has changed since the last full backup. For example, if
you perform a full database backup on Monday morning and
then perform a differential database backup on Monday
evening, the differential backup will be a much smaller file
(that takes much less time to create) that includes only the
data changed during the day on Monday.
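In T-SQL, the Monday scenario above would look like this (database and backup file names hypothetical):

```sql
-- Monday morning: full backup of everything in the database
BACKUP DATABASE MyDB TO DISK = 'C:\Backups\MyDB_Full.bak'

-- Monday evening: only the data changed since the last full backup
BACKUP DATABASE MyDB TO DISK = 'C:\Backups\MyDB_Diff.bak' WITH DIFFERENTIAL
```

To restore, you would restore the full backup first and then apply the most recent differential on top of it.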

How can you convert a Date to a String?
select convert(varchar,columnname,109)
select convert(varchar(20),getdate())
HOW TO FIND THE EMPLOYEE DETAILS WHO ARE GETTING SAME SALARY IN EMP
TABLE
SELECT * FROM EMP WHERE (SAL IN (SELECT sal
FROM emp GROUP BY sal HAVING COUNT(sal) > 1))
Can anybody explain me cold backup and hot backup?

Cold Backup and Hot Backup terms are used by Oracle. These
terms are not available In MS SQL Server.
Cold Backup: taking the database offline and copying the database
files to a different location is called a cold backup in Oracle.
Hot Backup: taking the database backup while the database
is online.
Which databases are part of SQL server default installation? Explain the usage of each?
4 key default dbs:
Master db: holds info on all dbs located on the SQL Server
instance. Main db (else SQL Server won't work!).
MSDB: stores info regarding backups, SQL Agent info, DTS
packages, SQL Server jobs, and replication info for log
shipping.
Tempdb: holds temp objects like global and local temp
tables and sps.
Model db: used as the template in the creation of any new database
within the SQL Server instance of which it (model) is a part.
how to delete duplicate rows from table
Table name: New

Sno  Name
1    Rajesh
2    Rajesh
3    Raja
4    Raja
5    Arun
6    Bala
Delete From New Where Sno Not IN
(Select Min(Sno) From New Group By Name)
What is the diff b/n CTE and temp tables?
Temp tables are always on disk, so as long as your CTE can be held in memory it
would most likely be faster (like a table variable, too).
But then again, if the data load of your CTE (or table variable) gets too big, it will
be stored on disk too, so there is no big benefit.
In general, I prefer a CTE over a temp table since it is gone after I have used it; I don't
need to think about dropping it explicitly or anything.
A traditional temp table stores its data in tempdb, which does slow temp
tables down; table variables do not.
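A minimal CTE sketch using the emp table referenced earlier (the column names are assumed):

```sql
WITH TopEarners (empname, salary) AS
(
    SELECT TOP 10 empname, salary
    FROM emp
    ORDER BY salary DESC
)
SELECT * FROM TopEarners
-- TopEarners exists only for the duration of this one statement;
-- there is nothing to DROP afterwards
```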
Diff b/n @@identity and scope_identity

@@IDENTITY, SCOPE_IDENTITY, and IDENT_CURRENT are similar functions
because they all return the last value inserted into the IDENTITY column of a table.
@@IDENTITY and SCOPE_IDENTITY return the last identity value generated in any
table in the current session. However, SCOPE_IDENTITY returns the value only within
the current scope; @@IDENTITY is not limited to a specific scope.
It took me a minute to find a good example to illustrate the difference between the two,
but with a trigger I created the following example:
USE TempDB
GO
CREATE TABLE tst
( a int identity(1,1), s varchar(10))
GO
CREATE TABLE tst2
( a int identity(1000,1), s varchar(10))
GO
CREATE TRIGGER dbo.trgTst
ON tst
AFTER INSERT
AS INSERT tst2 SELECT inserted.s FROM inserted
GO
INSERT tst VALUES('a')
SELECT
@@IDENTITY AS [@@IDENTITY],
SCOPE_IDENTITY() AS [SCOPE_IDENTITY()]
GO
DROP TABLE tst2
DROP TABLE tst
SCOPE_IDENTITY() will give you the value of tst.$IDENTITY, ignoring the identity
value that is generated for table tst2 with the trigger after the insert into tst.
@@IDENTITY will give you that value from tst2.$IDENTITY.
So the function we actually need in our case is not @@IDENTITY but
SCOPE_IDENTITY().
@@IDENTITY
It returns the last IDENTITY value produced on a connection, regardless of the table that
produced the value, and regardless of the scope of the statement that produced the value.
@@IDENTITY will return the last identity value entered into a table in your current

session. While @@IDENTITY is limited to the current session, it is not limited to the
current scope. If you have a trigger on a table that causes an identity to be created in
another table, you will get the identity that was created last, even if it was the trigger that
created it.
SCOPE_IDENTITY()
It returns the last IDENTITY value produced on a connection and by a statement in the same scope,
regardless of the table that produced the value.
SCOPE_IDENTITY(), like @@IDENTITY, will return the last identity value created in
the current session, but it will also limit it to your current scope as well. In other words, it
will return the last identity value that you explicitly created, rather than any identity that
was created by a trigger or a user defined function.
IDENT_CURRENT(‘tablename’)
It returns the last IDENTITY value produced in a table, regardless of the connection that
created the value, and regardless of the scope of the statement that produced the value.
IDENT_CURRENT is not limited by scope and session; it is limited to a specified table.
IDENT_CURRENT returns the identity value generated for a specific table in any
session and any scope.
To avoid the potential problems associated with adding a trigger later on, always use
SCOPE_IDENTITY() to return the identity of the recently added row in your T-SQL
statement or stored procedure.
