MLEM + Modal + nanoGPT blog code

I am following the blog post "MLEM + Modal + nanoGPT" and cannot get past this command, which I am running on Google Colab with a GPU runtime:
!python train.py config/train_mlemai.py --device=cuda --dtype=float32 --max_iters=3000 --init_from=scratch
The full output, including the error message, is below.

Overriding config with config/train_mlemai.py:

# train a miniature character-level shakespeare model
# good for debugging and playing on macbooks and such

out_dir = 'out-mlemai-char'
eval_interval = 250 # keep frequent because we'll overfit
eval_iters = 200
log_interval = 10 # don't print too too often

# we expect to overfit on this small dataset, so only save when val improves
always_save_checkpoint = False

wandb_log = False # override via command line if you like

dataset = 'mlem-docs'
batch_size = 64
block_size = 256 # context of up to 128 previous characters

# baby GPT model :)
n_layer = 6
n_head = 6
n_embd = 384
dropout = 0.2

learning_rate = 1e-3 # with baby networks can afford to go a bit higher
max_iters = 5000
lr_decay_iters = 5000 # make equal to max_iters usually
min_lr = 1e-4 # learning_rate / 10 usually
beta2 = 0.99 # make a bit bigger because number of tokens per iter is small

warmup_iters = 100 # not super necessary potentially

# on macbook also add
device = 'mps' # run on cpu only
compile = False # do not torch compile the model

Overriding: device = cuda
Overriding: dtype = float32
Overriding: max_iters = 3000
Overriding: init_from = scratch
vocab_size = 153 (from data/mlem-docs/meta.pkl)
Initializing a new model from scratch
number of parameters: 10.80M
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [64,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [65,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [66,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [67,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [68,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [69,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [70,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [71,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [72,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [73,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [74,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [75,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [76,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [77,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [78,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [79,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [80,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [81,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [82,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [83,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [84,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [85,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [86,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [87,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [88,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [89,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [90,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [91,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [92,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [93,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [94,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [95,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [0,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [1,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [2,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [3,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [4,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [5,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [6,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [7,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [8,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [9,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [10,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [11,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [12,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [13,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [14,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [15,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [16,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [17,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [18,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [19,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [20,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [21,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [22,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [23,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [24,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [25,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [26,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [27,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [28,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [29,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [30,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [75,0,0], thread: [31,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [96,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [97,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [98,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [99,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [100,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [101,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [102,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [103,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [104,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [105,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [106,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [107,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [108,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [109,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [110,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [111,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [112,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [113,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [114,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [115,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [116,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [117,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [118,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [119,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [120,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [121,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [122,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [123,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [124,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [125,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [126,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [35,0,0], thread: [127,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [32,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [33,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [34,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [35,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [36,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [37,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [38,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [39,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [40,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [41,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [42,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [43,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [44,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [45,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [46,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [47,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [48,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [49,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [50,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [51,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [52,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [53,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [54,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [55,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [56,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [57,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [58,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [59,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [60,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [61,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [62,0,0] Assertion srcIndex < srcSelectDimSize failed.
…/aten/src/ATen/native/cuda/Indexing.cu:1146: indexSelectLargeIndex: block: [4,0,0], thread: [63,0,0] Assertion srcIndex < srcSelectDimSize failed.
Traceback (most recent call last):
  File "/content/drive/MyDrive/Iteractive.ai/nanoGPT/train.py", line 235, in <module>
    losses = estimate_loss()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/content/drive/MyDrive/Iteractive.ai/nanoGPT/train.py", line 196, in estimate_loss
    logits, loss = model(X, Y)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/drive/MyDrive/Iteractive.ai/nanoGPT/model.py", line 139, in forward
    x = block(x)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/drive/MyDrive/Iteractive.ai/nanoGPT/model.py", line 89, in forward
    x = x + self.attn(self.ln_1(x))
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/content/drive/MyDrive/Iteractive.ai/nanoGPT/model.py", line 47, in forward
    q, k ,v = self.c_attn(x).split(self.n_embd, dim=2)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)
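
As far as I understand, the repeated `srcIndex < srcSelectDimSize` assertion often means that some index lookup (for example, a token embedding lookup) received an index outside its table, and the later cuBLAS error may only be a side effect of that. One thing I could check is whether every token id in the prepared data is below the reported vocab_size of 153. Below is a minimal sketch of such a check, not something from the blog post. It assumes the usual nanoGPT data layout: a `meta.pkl` with a `vocab_size` key, and `train.bin` / `val.bin` files holding raw unsigned 16-bit token ids. The helper names (`max_token_id`, `check_split`, `check_dataset`) are my own.

```python
import array
import pickle
from pathlib import Path


def max_token_id(bin_path):
    """Return the largest token id in a file of raw uint16 token ids (-1 if empty)."""
    ids = array.array("H")  # 'H' = unsigned 16-bit; assumed to match how the data was written
    ids.frombytes(Path(bin_path).read_bytes())
    return max(ids, default=-1)


def check_split(bin_path, vocab_size):
    """Return True if every token id in bin_path is a valid index for vocab_size entries."""
    largest = max_token_id(bin_path)
    ok = largest < vocab_size
    print(f"{bin_path}: max id = {largest}, vocab_size = {vocab_size}, ok = {ok}")
    return ok


def check_dataset(data_dir):
    """Check train.bin and val.bin (assumed file names) against vocab_size in meta.pkl."""
    data_dir = Path(data_dir)
    with open(data_dir / "meta.pkl", "rb") as f:
        vocab_size = pickle.load(f)["vocab_size"]  # assumed key name
    results = {}
    for name in ("train.bin", "val.bin"):
        path = data_dir / name
        if path.exists():
            results[name] = check_split(path, vocab_size)
    return results


if __name__ == "__main__":
    # Directory taken from the log line "vocab_size = 153 (from data/mlem-docs/meta.pkl)".
    if Path("data/mlem-docs/meta.pkl").exists():
        check_dataset("data/mlem-docs")
```

If any split reports `ok = False`, the data and `meta.pkl` were probably produced by different preparation runs. Rerunning with `--device=cpu` might also help, since an out-of-range index on CPU should raise an ordinary Python exception at the failing line instead of a delayed CUDA error.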