C++ systems programming begins with a tight grip on the language’s lowest-level primitives. Before writing a single CUDA kernel or allocating a GPU buffer, you need to understand how C++ models memory through its type system, how it controls I/O, how arrays and loops work at the hardware level, and how to split code across compilation units using header files. This page walks through each of those fundamentals using the small, focused programs from the 1. cpp core module of this project — the Week 1–2 foundation work that every GPU and transformer concept later in the course builds on.

I/O: cin, cout, and getline

C++ I/O uses stream objects from <iostream>. cout writes to stdout, cin >> reads a single whitespace-delimited token, and getline (declared in <string>) reads an entire line, including spaces.
#include <iostream>
#include <string>
using namespace std;

int main(){
    string my_fav_color;
    cout << "Enter your favourite color Vraj :) " << endl;
    getline(cin, my_fav_color);
    cout << "Vraj fav color is " << my_fav_color << endl;
    return 0;
}
getline is required whenever the input may contain spaces. Using cin >> my_fav_color would stop reading at the first space character.

Integer Types and Sizes

C++ guarantees minimum bit widths for its primitive types, but the exact size is platform-dependent. sizeof returns a size in bytes, so multiply it by 8 (the value of CHAR_BIT on mainstream platforms) to get the actual bit width on your machine. This matters in GPU kernel code, where tensor element widths (float16, int8, int32) must match exactly.
#include <cstdio>    // printf
#include <iostream>
#include <cstdint>
using namespace std;

int main(){
    // 1 byte is 8 bits; sizeof yields a size_t, which printf formats with %zu
    printf("size of int data type is %zu bits\n", sizeof(int) * 8);
    printf("size of char data type is %zu bits\n", sizeof(char) * 8);
    printf("size of short int data type is %zu bits\n", sizeof(short int) * 8);
    printf("size of long int data type is %zu bits\n", sizeof(long int) * 8);
    printf("size of long long int data type is %zu bits\n", sizeof(long long int) * 8);
    // output on a platform where long int is 32 bits (e.g. Windows):
    // size of int data type is 32 bits
    // size of char data type is 8 bits
    // size of short int data type is 16 bits
    // size of long int data type is 32 bits
    // size of long long int data type is 64 bits
    return 0;
}

Fixed-Width Types

Prefer <cstdint> types like int8_t, int32_t, and uint64_t when the bit width must be exact — essential for matching CUDA tensor element types.

Platform Variation

long int is 32 bits on Windows (MSVC/MinGW) but 64 bits on 64-bit Linux and macOS. long long int is guaranteed to be at least 64 bits and is exactly 64 bits on all mainstream platforms.

Arrays and Iteration

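A C-style array declared inside a function occupies a contiguous block of memory on the stack. It is the conceptual ancestor of a GPU device buffer: a flat sequence of elements at a known base address. If a local array has no initializer, its elements hold indeterminate (garbage) values, and reading one before assigning to it is undefined behavior.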
#include <iostream>
using namespace std;

int main(){
    int my_array[5] = {3, 54, 23, 54, 435};
    cout << my_array[0] << endl;  // 3

    int another_array[5];
    another_array[0] = 9;

    cout << another_array[0] << endl;  // 9
    cout << another_array[1] << endl;  // garbage value (never assigned; reading it is undefined behavior)
}

Looping Over Arrays

C++ provides three loop forms for arrays. The range-based for loop (C++11 and later) is the most readable for sequential access.
#include <iostream>
#include <iterator>  // std::size (C++17)
using namespace std;

int main(){
    int myarray[10] = {1, 3, 4, 5, 6, 7, 4, 3, 34, 12};
    int n = static_cast<int>(size(myarray));  // element count: 10

    // index-based for loop
    for(int i = 0; i < n; i++){
        cout << myarray[i] << endl;
    }

    // range-based for loop (C++11)
    for(int i : myarray){
        cout << i << endl;
    }

    // while loop
    int i = 0;
    while(i < n){
        cout << myarray[i] << endl;
        i++;
    }
}
Prefer the range-based for(int i : myarray) syntax when you only need the values and do not need the index. It eliminates off-by-one errors and reads closer to Python’s for x in array.

Functions

A function in C++ must be declared before it is called, either by defining it earlier in the file or by writing a forward declaration. A void function returns nothing; for any other return type, the returned expression must be convertible to that type. Parameters are passed by value by default, so the function receives a copy.
#include <cstdio>    // puts, printf
#include <iostream>
using namespace std;

void sayHello(){
    puts("Hello vraj");
}

int addNums(int a, int b){
    int sum = a + b;
    return sum;
}

int main(){
    sayHello();

    int x = 3;
    int y = 5;

    int answer = addNums(x, y);
    printf("our final answer %d\n", answer);
    return 0;
}
1. Declare or define before use

Place function definitions above main, or use a forward declaration (int addNums(int a, int b);) at the top of the file.

2. Match return type

The value a function returns must be convertible to the return type in its signature. Reaching the end of a non-void function (other than main) without a return statement is undefined behavior.

3. Pass by reference for large objects

For large structs or vectors, pass by const& to avoid copying the entire object. This becomes critical for tensor-sized allocations in ML code.

Header Files and Include Guards

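When a project grows beyond one file, declarations move into header files (.h) and definitions stay in .cpp files. The #ifndef / #define / #endif pattern, called an include guard, prevents the header's contents from being processed more than once within a single translation unit, for example when two other headers both include it. It does not stop each .cpp file from getting its own copy, so a function that is defined in a header, as addNum is here, must be marked inline to avoid duplicate-definition errors at link time.
#ifndef adder_h
#define adder_h

// inline allows this definition to appear in every .cpp file that includes the header
inline int addNum(int a, int b){
    return a + b;
}

#endif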
Modern C++ projects often use #pragma once as a simpler alternative to manual include guards. Both prevent double-inclusion; #pragma once is a compiler extension supported by GCC, Clang, and MSVC.

Structs and const Fields

A struct groups related data under a single type. C++ distinguishes between const int (the integer value is immutable) and const char* (the pointer points to immutable data, but the pointer itself can be reassigned). This distinction appears frequently in ML code when labeling or describing tensor metadata.
#include <iostream>
using namespace std;

struct User{
    const int uId;       // the integer value is const
    const char *name;    // pointer to const data (pointer itself is mutable)
    const char *email;
    int course_count;
};

int main(){
    User vraj = {1, "VrajPatel", "vraj@gmail.com", 2};  // plain 1, not 001: a leading 0 makes an octal literal
    User abc  = {2, "apatel",   "a@gmail.com",    3};
    return 0;
}

const int uId

The integer value itself cannot be changed after construction. Attempting vraj.uId = 999 will not compile.

const char* name

The characters the pointer references are const, but name can be pointed at a different string literal. Use char* const name to make the pointer itself immutable.

Compiling Single Files

Each source file in the 1. cpp core module is a standalone program. Compile and run any single file directly with g++:
# Compile and run a single file
g++ -std=c++20 basic.cpp -o basic
./basic

# Compile with the common warning sets enabled (recommended)
g++ -std=c++20 -Wall -Wextra basic.cpp -o basic
./basic
# Examples for each concept file
g++ -std=c++20 integers.cpp    -o integers    && ./integers
g++ -std=c++20 arrays.cpp      -o arrays      && ./arrays
g++ -std=c++20 iteration.cpp   -o iteration   && ./iteration
g++ -std=c++20 functions.cpp   -o functions   && ./functions
g++ -std=c++20 header.cpp      -o header      && ./header
g++ -std=c++20 struct.cpp      -o struct_demo && ./struct_demo
For multi-file builds using CMake, see the Build System page.
