XSD String Data Types

Welcome to The Coding College, where we simplify coding concepts! In this article, we’ll explore the string data types in XML Schema Definition (XSD) and how to use them effectively to validate your XML content.

What Are XSD String Data Types?

In XSD, string data types define text-based elements and attributes. The built-in types differ in how they treat whitespace and which characters they allow, and facets let you add further constraints such as length limits and patterns.

Primary String Data Types in XSD

Data Type | Description
xs:string | Any character data; whitespace is preserved as written.
xs:normalizedString | A string in which line feeds, tabs, and carriage returns are replaced with spaces.
xs:token | A string with normalized whitespace: leading and trailing spaces are removed and runs of spaces collapse to one.
xs:language | A string in the form of a language tag (e.g., “en”, “fr”, “en-US”).
xs:Name | A valid XML name (no spaces; must start with a letter, underscore, or colon).
xs:NCName | A non-colonized name (no spaces or colons; must start with a letter or underscore).
xs:ID | An identifier that must be unique within the document (used for identifying elements).
xs:IDREF | A reference to an xs:ID value in the same document.
xs:IDREFS | A list of one or more xs:IDREF values, separated by whitespace.

XSD String Data Type Examples

1. Using xs:string

This is the most basic string data type, allowing any text content.

XSD Schema

<xs:element name="message" type="xs:string"/>

Valid XML

<message>Hello, World!</message>

2. Using xs:normalizedString

This type replaces tabs, newlines, and carriage returns with spaces.

XSD Schema

<xs:element name="note" type="xs:normalizedString"/>

Valid XML

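<note>Line one
Line two</note>

The value is treated as:

Line one Line two

Only tabs, line feeds, and carriage returns are replaced. Consecutive spaces and leading or trailing spaces are kept as written; use xs:token to collapse them.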

3. Using xs:token

This type removes leading and trailing spaces and collapses multiple spaces into a single space.

XSD Schema

<xs:element name="keyword" type="xs:token"/>

Valid XML

<keyword>   XML     Schema   Definition   </keyword>

The value is treated as:

XML Schema Definition
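The difference between these types comes from the whiteSpace facet: xs:string uses preserve, xs:normalizedString uses replace, and xs:token uses collapse. As a minimal sketch, the same behavior can be declared explicitly on a restriction of xs:string. The type names ReplacedText and CollapsedText are hypothetical and chosen only for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical type: whitespace handling like xs:normalizedString.
       Each tab, line feed, and carriage return becomes one space. -->
  <xs:simpleType name="ReplacedText">
    <xs:restriction base="xs:string">
      <xs:whiteSpace value="replace"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Hypothetical type: whitespace handling like xs:token.
       After replacement, leading and trailing spaces are removed
       and runs of spaces collapse to a single space. -->
  <xs:simpleType name="CollapsedText">
    <xs:restriction base="xs:string">
      <xs:whiteSpace value="collapse"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="note" type="ReplacedText"/>
  <xs:element name="keyword" type="CollapsedText"/>
</xs:schema>
```

In practice the built-in types are usually the simpler choice; the explicit facet is mainly useful for seeing what they do.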

4. Using xs:language

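This type accepts language tags such as “en”, “fr”, “es”, or “en-US”. Validators check only that the value has the form of a language tag (subtags of up to eight characters separated by hyphens), not that the tag names a registered language.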

XSD Schema

<xs:element name="language" type="xs:language"/>

Valid XML

<language>en</language>
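To make the boundary clearer, here are a few more instance fragments for the same language element, each meant to be validated on its own. This is an illustrative sketch; the specific values are examples picked for this article.

```xml
<!-- Accepted: each value has the form of a language tag -->
<language>en-US</language>
<language>fr-CA</language>

<!-- Rejected: an underscore is not a valid subtag separator -->
<language>en_US</language>

<!-- Rejected: a subtag may not be longer than eight characters -->
<language>portuguese</language>
```

Because only the form is checked, a well-formed tag passes even if it does not correspond to a real language. Add an enumeration if you need to restrict values to a known set.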

5. Using xs:Name

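This type ensures the content is a valid XML name: it starts with a letter, an underscore, or a colon, continues with name characters such as letters, digits, hyphens, underscores, colons, and periods, and contains no spaces.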

XSD Schema

<xs:element name="username" type="xs:Name"/>

Valid XML

<username>_userName123</username>
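The related xs:NCName type applies the same rules but also forbids colons. The sketch below declares one element of each type and lists sample values in a comment; the localName element is hypothetical and added only for comparison.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="username" type="xs:Name"/>

  <!-- Hypothetical element used to contrast xs:NCName with xs:Name -->
  <xs:element name="localName" type="xs:NCName"/>
</xs:schema>
<!--
  username (xs:Name)
    valid:   _userName123, user.name, ns:user
    invalid: 123user (starts with a digit), user name (contains a space)

  localName (xs:NCName)
    valid:   _userName123, user.name
    invalid: ns:user (contains a colon)
-->
```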

6. Using xs:ID and xs:IDREF

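These types enforce unique identifiers and references to them. Every xs:ID value must be a valid NCName and unique within the XML document, and every xs:IDREF value must match an xs:ID value in the same document. The XML Schema specification recommends using these types on attributes for compatibility with DTD-based tools; the element-based form shown here is still allowed by XSD.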

XSD Schema

<xs:element name="book">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="authorID" type="xs:ID"/>
      <xs:element name="authorRef" type="xs:IDREF"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

Valid XML

<book>
  <title>Learning XML</title>
  <authorID>auth001</authorID>
  <authorRef>auth001</authorRef>
</book>
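A more typical arrangement puts the identifier on an attribute and lets another element point to it. The following sketch also shows xs:IDREFS, which holds several references in one value. The library and author elements and the id and authors attributes are hypothetical names introduced for this example.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="library">
    <xs:complexType>
      <xs:sequence>
        <!-- Each author carries a document-wide unique id -->
        <xs:element name="author" maxOccurs="unbounded">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute name="id" type="xs:ID" use="required"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
        <!-- Each book lists one or more author ids, separated by spaces -->
        <xs:element name="book" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="title" type="xs:string"/>
            </xs:sequence>
            <xs:attribute name="authors" type="xs:IDREFS" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
<!--
  Valid instance:

  <library>
    <author id="auth001">Jane Roe</author>
    <author id="auth002">John Doe</author>
    <book authors="auth001 auth002">
      <title>Learning XML</title>
    </book>
  </library>

  Invalid variations: two authors sharing id="auth001", or a book with
  authors="auth003" when no element in the document has that id.
-->
```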

String Data Type Facets

You can add facets to string data types to impose restrictions, such as length, patterns, or enumeration.

Common Facets for Strings

Facet | Description
length | Specifies the exact length of the string.
minLength | Specifies the minimum length of the string.
maxLength | Specifies the maximum length of the string.
pattern | Specifies a regular expression that the string must match.
enumeration | Restricts the string to specific predefined values.

Examples with Facets

1. Restricting String Length

XSD Schema

<xs:element name="username">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:minLength value="5"/>
      <xs:maxLength value="15"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

Valid XML

<username>JohnDoe123</username>

Invalid XML

<username>JD</username> <!-- Too short -->
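The length facet works the same way when the value must have an exact size. As a small sketch, the hypothetical countryCode element below requires exactly two characters. It is based on xs:token, so surrounding whitespace is removed before the length is checked.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical element: a code of exactly two characters -->
  <xs:element name="countryCode">
    <xs:simpleType>
      <xs:restriction base="xs:token">
        <xs:length value="2"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
<!--
  Valid:   <countryCode>US</countryCode>
           <countryCode> US </countryCode>  (collapsed to "US" first)
  Invalid: <countryCode>USA</countryCode>   (three characters)
-->
```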

2. Using Pattern

XSD Schema

<xs:element name="email">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

Valid XML

<email>user@example.com</email>

Invalid XML

<email>user@com</email> <!-- Invalid email format -->
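One detail worth knowing is that an XSD pattern must match the entire value, so no ^ or $ anchors are needed. The sketch below uses a hypothetical productCode element with a simpler pattern to make that visible.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical element: three uppercase letters, a hyphen, four digits -->
  <xs:element name="productCode">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[A-Z]{3}-[0-9]{4}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
<!--
  Valid:   ABC-1234
  Invalid: abc-1234   (lowercase letters)
           ABC-12345  (extra digit; the whole value must match)
           XABC-1234  (extra leading letter)
-->
```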

3. Using Enumeration

XSD Schema

<xs:element name="gender">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:enumeration value="Male"/>
      <xs:enumeration value="Female"/>
      <xs:enumeration value="Other"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

Valid XML

<gender>Male</gender>

Invalid XML

<gender>Unknown</gender> <!-- Not in the enumeration list -->
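Enumeration values are compared exactly, including case and whitespace. With xs:string as the base, a value with surrounding spaces would not match. A possible variation, sketched below with a hypothetical named type GenderType, uses xs:token as the base so that surrounding whitespace is removed before the comparison, and makes the type reusable across elements and attributes.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical named type; the xs:token base collapses whitespace
       before the value is compared with the enumeration list -->
  <xs:simpleType name="GenderType">
    <xs:restriction base="xs:token">
      <xs:enumeration value="Male"/>
      <xs:enumeration value="Female"/>
      <xs:enumeration value="Other"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="gender" type="GenderType"/>
</xs:schema>
<!--
  Valid:   <gender>Male</gender>
           <gender> Male </gender>  (collapsed to "Male" first)
  Invalid: <gender>male</gender>    (comparison is case-sensitive)
-->
```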

When to Use String Data Types?

  • Use xs:string for general text content.
  • Use xs:normalizedString or xs:token when whitespace should be normalized before the value is used.
  • Use xs:language, xs:Name, xs:ID, and xs:IDREF for specific use cases requiring unique IDs, valid names, or language codes.

Conclusion

XSD string data types are versatile and powerful tools for defining and validating textual content in XML documents. With the use of facets and specific string types, you can enforce precise rules and constraints for your XML data.

Keep exploring XML and related technologies with The Coding College—your trusted resource for coding tutorials!
