Tuesday, March 29, 2005

Collation in sql server


The term, collation, refers to a set of rules that determine how
data is sorted and compared
. In Microsoft® SQL Server 2000, it
is not required to separately specify code page and sort order for
character data, and the collation used for Unicode data. Instead,
specify the collation name and sorting rules to use. Character data is
sorted using rules that define the correct character sequence, with
options for specifying case-sensitivity, accent marks, kana character
types, and character width. Microsoft SQL Server 2000 collations
include these groupings:


  • Windows collations - Windows collations define rules for storing
    character data based on the rules defined for an associated
    Windows locale. The base Windows collation rules specify which
    alphabet or language is used when dictionary sorting is applied,
    as well as the code page used to store non-Unicode character
    data. For Windows collations, the nchar, nvarchar,
    and ntext data types have the same sorting behavior as
    char, varchar, and text data types


  • SQL collations - SQL collations are provided for
    compatibility
    with sort orders in earlier versions of
    Microsoft SQL Server.


Sort Order

Binary is the fastest sorting order, and is case-sensitive. If
Binary is selected, the Case-sensitive,
Accent-sensitive, Kana-sensitive, and
Width-sensitive options are not available.



"all" width="90%" border="1">



























Sort order

Description

Binary

Sorts and compares data in Microsoft® SQL
Server™ tables based on the bit patterns defined for
each character. Binary sort order is case-sensitive, that
is lowercase precedes uppercase, and accent-sensitive.
This is the fastest sorting order.

If this option is not selected, SQL Server follows sorting
and comparison rules as defined in dictionaries for the
associated language or alphabet.

Case-sensitive

Specifies that SQL Server distinguish between uppercase
and lowercase letters.

If not selected, SQL Server considers the uppercase and
lowercase versions of letters to be equal. SQL Server does
not define whether lowercase letters sort lower or higher
in relation to uppercase letters when Case-sensitive is
not selected.

Accent-sensitive

Specifies that SQL Server distinguish between accented and
unaccented characters. For example, 'a' is not
equal to 'á'.

If not selected, SQL Server considers the accented and
unaccented versions of letters to be equal.

Kana-sensitive

Specifies that SQL Server distinguish between the two
types of Japanese kana characters: Hiragana and
Katakana.

If not selected, SQL Server considers Hiragana and
Katakana characters to be equal.

Width-sensitive

Specifies that SQL Server distinguish between a
single-byte character (half-width) and the same character
when represented as a double-byte character
(full-width).

If not selected, SQL Server considers the single-b



No comments: