M7: AI

  • First two layers become edge detectors —> combinations ofedges become corners —> if you combine corners with edges, then you have shapes —> shapes become objects
  • backpropagation is necessary “minimizing the loss function” b/c you can’t interdict half way through

Don’t beehive information

Token = 3/4 of a word

computational time: number of layers is exponential by layer

32 bits vs. 8 bits and the implications, it’s a 1/3 right? Look up CPU, GPU, and TPU

  • What is your plan to upgrade new LLMs?
  • Can you ask what data was this trained on?