The Three Types
Go represents text using three related but distinct types:
string: A read-only sequence of bytes. Typically UTF-8 encoded text.
rune: A Unicode code point. Alias for int32. Represents a single character.
byte: A single byte. Alias for uint8. The building block of strings.
Strings
A string is a read-only slice of bytes. In Go, strings are typically UTF-8 encoded:
str := "hey"
// Strings are immutable - you cannot modify them
// str[0] = 'H' // ❌ Compile error: cannot assign to str[0]
// Get length in bytes
fmt.Println(len(str)) // 3
// Access individual bytes
fmt.Printf("%c\n", str[0]) // 'h'
fmt.Printf("%c\n", str[1]) // 'e'
fmt.Printf("%c\n", str[2]) // 'y'
len() returns the number of bytes, not the number of characters. For Unicode strings, these can differ!
Unicode and Multi-Byte Characters
import "unicode/utf8"
str := "Yūgen ☯ 💀"
fmt.Println(len(str)) // 15 bytes (not 9 characters!)
// To get the number of characters (runes)
runeCount := utf8.RuneCountInString(str)
fmt.Println(runeCount) // 9 runes
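To see where the gap between byte count and rune count comes from, the sketch below lists the encoded width of every rune in a string. The helper name byteWidths is hypothetical, introduced only for illustration, and it assumes the input is valid UTF-8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteWidths returns the UTF-8 encoded width, in bytes, of each rune in s.
// Hypothetical helper for illustration; assumes s is valid UTF-8.
func byteWidths(s string) []int {
	widths := make([]int, 0, utf8.RuneCountInString(s))
	for _, r := range s {
		widths = append(widths, utf8.RuneLen(r))
	}
	return widths
}

func main() {
	s := "Hello, 世界"
	fmt.Println(byteWidths(s))                     // [1 1 1 1 1 1 1 3 3]
	fmt.Println(len(s), utf8.RuneCountInString(s)) // 13 9
}
```

The widths add up to len(s), while the number of entries equals the rune count.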
Bytes
Bytes are the raw building blocks. A byte is an alias for uint8:
// String to bytes
str := "hey"
bytes := []byte(str)
fmt.Printf("%d\n", bytes) // [104 101 121]
// Bytes to string
bytes = []byte{104, 101, 121}
str = string(bytes)
fmt.Println(str) // "hey"
Working with Byte Slices
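str := "Yūgen ☯ 💀"
// Convert to bytes
bytes := []byte(str)
// You CAN modify byte slices (unlike strings)
bytes[0] = 'N' // 'Y' is a single byte, so this replaces one whole character
// Careful: 'ū' spans bytes[1] and bytes[2]. Overwriting only bytes[1]
// would leave a stray byte behind and produce invalid UTF-8.
// Convert back to string
str = string(bytes)
fmt.Println(str) // "Nūgen ☯ 💀"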
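Byte-level edits are only safe when they replace whole characters. As a minimal sketch of what can go wrong, the hypothetical helper overwriteByte below replaces a single byte, and utf8.ValidString reports whether the result is still well-formed UTF-8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// overwriteByte returns a copy of s with the byte at index i replaced by b.
// Hypothetical helper for illustration; it panics if i is out of range.
func overwriteByte(s string, i int, b byte) string {
	buf := []byte(s) // the conversion copies, so s itself is untouched
	buf[i] = b
	return string(buf)
}

func main() {
	word := "öykü" // 'ö' occupies bytes 0 and 1

	ok := overwriteByte(word, 2, 'Y')     // byte 2 is the single-byte 'y'
	fmt.Println(ok, utf8.ValidString(ok)) // öYkü true

	bad := overwriteByte(word, 0, 'o')                // clobbers only the first half of 'ö'
	fmt.Printf("%q %v\n", bad, utf8.ValidString(bad)) // "o\xb6ykü" false
}
```

When a character may be wider than one byte, converting to []rune before editing avoids this problem.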
Byte Representation
bytes := []byte("Yūgen ☯ 💀")
fmt.Printf("% x\n", bytes)
// Output (in hex): 59 c5 ab 67 65 6e 20 e2 98 af 20 f0 9f 92 80
// Note: Some characters take multiple bytes:
// 'Y' = 59 (1 byte)
// 'ū' = c5 ab (2 bytes)
// '☯' = e2 98 af (3 bytes)
// '💀' = f0 9f 92 80 (4 bytes)
Runes
A rune represents a Unicode code point. The rune type is an alias for int32:
// Rune literals use single quotes
var r rune = 'h'
fmt.Printf("%c = %d\n", r, r) // h = 104
Rune Literals Are Untyped
Rune literals are untyped constants, so they can be assigned to any numeric type that can hold the value:
var anInt int = 'h'
var anInt8 int8 = 'h'
var anInt16 int16 = 'h'
var anInt32 int32 = 'h'
var aRune rune = 'h' // rune is alias for int32
fmt.Printf("%T %T %T %T %T\n", anInt, anInt8, anInt16, anInt32, aRune)
// Output: int int8 int16 int32 int32
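Two details follow from rune literals being constants: without an explicit type they default to rune, and the compiler rejects an assignment when the value does not fit the target type. The sketch below illustrates both; typeName is a hypothetical helper, and the rejected line is left commented out so the program still compiles.

```go
package main

import "fmt"

// typeName reports the dynamic type of v as fmt's %T verb prints it.
// Hypothetical helper for illustration.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	r := 'h'            // no explicit type: defaults to rune (int32)
	var b byte = 'h'    // 104 fits in a byte
	var f float64 = 'h' // a rune constant can also initialize a float
	// var bad byte = 'ū' // would not compile: 363 does not fit in a byte

	fmt.Println(typeName(r), typeName(b), typeName(f)) // int32 uint8 float64
	fmt.Println(b, f)                                  // 104 104
}
```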
Rune Representations
The same rune can be written in different formats:
// All represent the same rune 'h'
fmt.Printf("%q in decimal: %[1]d\n", 104) // 'h' in decimal: 104
fmt.Printf("%q in binary : %08[1]b\n", 'h') // 'h' in binary : 01101000
fmt.Printf("%q in hex : 0x%[1]x\n", 0x68) // 'h' in hex : 0x68
String to Runes
Convert strings to rune slices to work with individual characters:
str := "Yūgen ☯ 💀"
// Convert to runes
runes := []rune(str)
fmt.Println(len(runes)) // 9 (number of characters)
fmt.Printf("%c\n", runes[0]) // 'Y'
fmt.Printf("%c\n", runes[1]) // 'ū'
fmt.Printf("%c\n", runes[6]) // '☯'
fmt.Printf("%c\n", runes[8]) // '💀'
// Access multi-character sequences
fmt.Printf("%c\n", runes[:5]) // [Y ū g e n]
Trade-off: Rune slices use 4 bytes per character (fixed size), while UTF-8 strings use 1-4 bytes per character (variable size). Rune slices offer easy indexing; strings save memory.
Converting Between Types
str := "hey"
// String ↔ Bytes
bytes := []byte(str)
str = string(bytes)
// String ↔ Runes
runes := []rune(str)
str = string(runes)
// Character codes
fmt.Printf("%c : %d\n", 'h', 'h') // h : 104
fmt.Printf("%c : %d\n", 'e', 'e') // e : 101
fmt.Printf("%c : %d\n", 'y', 'y') // y : 121
Indexing Strings
Byte Indexing
Direct indexing gives you bytes, not characters:
str := "Yūgen ☯ 💀"
// Accessing by index gives bytes
fmt.Printf("1st byte: %c\n", str[0]) // 'Y' - works (1 byte char)
fmt.Printf("2nd byte: %c\n", str[1]) // 'Å' - wrong! (first byte of 'ū')
// Correct: use slicing for multi-byte characters
fmt.Printf("2nd rune: %s\n", str[1:3]) // "ū" (2 bytes)
Safe Character Access
For multi-byte characters, use rune slices or iteration:
word := "öykü"
// Wrong: byte indexing
fmt.Printf("%c\n", word[0]) // 'Ã' - wrong (first byte of ö)
// Correct: rune conversion
runes := []rune(word)
fmt.Printf("%c\n", runes[0]) // 'ö' - correct
// Or use slicing with correct byte positions
fmt.Printf("%s\n", word[:2]) // "ö" (2 bytes)
fmt.Printf("%c\n", word[2]) // 'y' (1 byte)
fmt.Printf("%c\n", word[3]) // 'k' (1 byte)
fmt.Printf("%s\n", word[4:]) // "ü" (2 bytes)
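Converting to []rune allocates a new slice for the whole string. When only one character is needed, a range loop can find it without that allocation. The sketch below is one possible approach; runeAt is a hypothetical helper, not a standard library function.

```go
package main

import "fmt"

// runeAt returns the n-th rune (0-based) of s and true, or 0 and false
// when s has fewer than n+1 runes. Hypothetical helper for illustration.
func runeAt(s string, n int) (rune, bool) {
	if n < 0 {
		return 0, false
	}
	for _, r := range s {
		if n == 0 {
			return r, true
		}
		n--
	}
	return 0, false
}

func main() {
	word := "öykü"

	r, ok := runeAt(word, 3)
	fmt.Printf("%c %v\n", r, ok) // ü true

	_, ok = runeAt(word, 4)
	fmt.Println(ok) // false
}
```

The lookup scans from the start of the string, so it takes time proportional to n. For repeated random access, a []rune conversion done once is usually the simpler choice.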
Iterating Over Strings
Range Loop (Automatic Rune Decoding)
The for range loop automatically decodes runes:
str := "Yūgen ☯ 💀"
for i, r := range str {
    fmt.Printf("str[%2d] = %q\n", i, r)
}
// Output:
// str[ 0] = 'Y'
// str[ 1] = 'ū'
// str[ 3] = 'g'
// str[ 4] = 'e'
// str[ 5] = 'n'
// str[ 6] = ' '
// str[ 7] = '☯'
// str[10] = ' '
// str[11] = '💀'
Notice the index jumps! The index is the byte position, not the rune position.
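A range loop also has defined behavior for bytes that are not valid UTF-8: each such byte yields utf8.RuneError (U+FFFD) and the index advances by one. The sketch below uses that to count malformed bytes; countInvalid is a hypothetical helper written for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// countInvalid reports how many bytes of s fail to decode as UTF-8.
// Hypothetical helper for illustration.
func countInvalid(s string) int {
	n := 0
	for i, r := range s {
		if r != utf8.RuneError {
			continue
		}
		// A genuine U+FFFD in the input is 3 bytes wide; a decoding
		// failure is reported with a width of 1.
		if _, size := utf8.DecodeRuneInString(s[i:]); size == 1 {
			n++
		}
	}
	return n
}

func main() {
	s := "a\xffb\xc5" // 0xff is never valid; 0xc5 starts a 2-byte sequence that is cut off
	for i, r := range s {
		fmt.Printf("index %d: %U\n", i, r)
	}
	fmt.Println(countInvalid(s))        // 2
	fmt.Println(countInvalid("\uFFFD")) // 0
}
```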
Manual Rune Decoding
Use utf8.DecodeRuneInString() for manual control:
import "unicode/utf8"
str := "öykü"
// Decode first rune
r, size := utf8.DecodeRuneInString(str)
fmt.Printf("rune: %c size: %d bytes\n", r, size)
// Output: rune: ö size: 2 bytes
// Decode all runes manually
for i := 0; i < len(str); {
    r, size := utf8.DecodeRuneInString(str[i:])
    fmt.Printf("%c", r)
    i += size
}
fmt.Println()
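Manual decoding is useful when a range loop does not fit, for example when walking a string from the end. As a sketch, the hypothetical reverse function below uses utf8.DecodeLastRuneInString, the backward counterpart of DecodeRuneInString.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// reverse returns s with its runes in reverse order.
// Hypothetical helper for illustration. It reverses code points, so
// combining marks would end up detached from their base letters.
func reverse(s string) string {
	var b strings.Builder
	b.Grow(len(s))
	for len(s) > 0 {
		r, size := utf8.DecodeLastRuneInString(s)
		b.WriteRune(r)
		s = s[:len(s)-size] // drop the rune just decoded
	}
	return b.String()
}

func main() {
	fmt.Println(reverse("öykü"))      // ükyö
	fmt.Println(reverse("Hello, 世界")) // 界世 ,olleH
}
```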
Practical Examples
Example 1: Character Counting
text := "Hello, 世界"
// Byte count
fmt.Printf("%d bytes\n", len(text)) // 13 bytes
// Rune (character) count
runeCount := utf8.RuneCountInString(text)
fmt.Printf("%d runes\n", runeCount) // 9 characters
Example 2: String Manipulation
word := "öykü"
// Convert to runes for easy manipulation
runes := []rune(word)
// Uppercase first letter (conceptual - use unicode package for real uppercasing)
runes[0] = 'Ö'
// Convert back to string
result := string(runes)
fmt.Println(result) // "Öykü"
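For real uppercasing, the unicode package provides the mapping so the replacement letter does not have to be hard-coded. The sketch below is one way to capitalize the first rune; capitalize is a hypothetical helper, and unicode.TurkishCase is shown only to indicate that language-specific rules need a special case table.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// capitalize returns s with its first rune upper-cased.
// Hypothetical helper for illustration.
func capitalize(s string) string {
	r, size := utf8.DecodeRuneInString(s)
	if r == utf8.RuneError {
		return s // empty string or undecodable first byte: leave unchanged
	}
	return string(unicode.ToUpper(r)) + s[size:]
}

func main() {
	fmt.Println(capitalize("öykü"))  // Öykü
	fmt.Println(capitalize("hello")) // Hello

	// The default mapping turns 'i' into 'I'. Turkish expects a dotted 'İ'.
	fmt.Printf("%c\n", unicode.TurkishCase.ToUpper('i')) // İ
}
```

Only the first rune is decoded here, so the rest of the string is reused without converting it to a rune slice.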
Example 3: Checking for Turkish Characters
import "strings"
func hasTurkishChars(s string) bool {
    turkishChars := "çÇğĞıİöÖşŞüÜ"
    for _, r := range s {
        if strings.ContainsRune(turkishChars, r) {
            return true
        }
    }
    return false
}
fmt.Println(hasTurkishChars("hello")) // false
fmt.Println(hasTurkishChars("öykü")) // true
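The same check can be written without an explicit loop. As an alternative sketch, strings.ContainsAny reports whether any rune from a set occurs in a string, and strings.IndexAny returns the byte index of the first match. The name containsTurkish is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// turkishChars lists letters specific to the Turkish alphabet.
const turkishChars = "çÇğĞıİöÖşŞüÜ"

// containsTurkish reports whether s contains any rune from turkishChars.
// Hypothetical helper for illustration.
func containsTurkish(s string) bool {
	return strings.ContainsAny(s, turkishChars)
}

func main() {
	fmt.Println(containsTurkish("hello")) // false
	fmt.Println(containsTurkish("öykü"))  // true

	// IndexAny returns a byte index, like the index in a range loop.
	fmt.Println(strings.IndexAny("kitap ve öykü", turkishChars)) // 9
}
```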
Memory Comparison
import "unsafe"
str := "Yūgen ☯ 💀"
// As string (UTF-8): variable bytes per character
bytes := []byte(str)
fmt.Printf("String: %d bytes\n", len(bytes)) // 15 bytes
// As runes: 4 bytes per character (fixed)
runes := []rune(str)
runeBytes := int(unsafe.Sizeof(runes[0])) * len(runes)
fmt.Printf("Runes : %d bytes\n", runeBytes) // 36 bytes (9 runes × 4)
Use strings (UTF-8) when:
Memory efficiency is important
Working with mostly ASCII text
Reading/writing data
Use rune slices when:
Need random access to characters
Character-by-character manipulation
Character counting is critical
Common String Operations
Length Operations
str := "Yūgen ☯ 💀"
// Byte length
fmt.Println(len(str)) // 15
// Rune (character) length
fmt.Println(utf8.RuneCountInString(str)) // 9
// On byte slice
bytes := []byte(str)
fmt.Println(len(bytes)) // 15 (bytes)
fmt.Println(utf8.RuneCount(bytes)) // 9 (runes)
String Building
For efficient string building, use strings.Builder:
import "strings"
var builder strings.Builder
builder.WriteString("Hello")
builder.WriteString(" ")
builder.WriteString("World")
result := builder.String()
fmt.Println(result) // "Hello World"
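strings.Builder also accepts individual runes through WriteRune, which pairs naturally with a range loop. The sketch below filters a string rune by rune; onlyLetters is a hypothetical helper, and the Grow call is an optional hint that reserves capacity up front.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// onlyLetters returns s with every non-letter rune removed.
// Hypothetical helper for illustration.
func onlyLetters(s string) string {
	var b strings.Builder
	b.Grow(len(s)) // the result can never be longer than the input
	for _, r := range s {
		if unicode.IsLetter(r) {
			b.WriteRune(r) // encodes r as UTF-8 and appends it
		}
	}
	return b.String()
}

func main() {
	fmt.Println(onlyLetters("Hello, 世界!")) // Hello世界
	fmt.Println(onlyLetters("öykü 42"))     // öykü
}
```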
Key Takeaways
You cannot modify a string directly. Convert to []byte or []rune for modifications.
Go source code is UTF-8, so string literals hold UTF-8 encoded text. Each character takes 1 to 4 bytes.
len(string) returns the number of bytes, not characters. Use utf8.RuneCountInString() for character count.
for range over a string automatically decodes UTF-8 runes, but gives byte positions as index.
string: immutable, UTF-8, memory-efficient
[]byte: mutable, raw bytes, 1 byte each
[]rune: mutable, Unicode code points, 4 bytes each