by Michele Laurelli
Parameter controlling randomness in text generation by scaling logits before softmax.
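Temperature T divides the logits before softmax: p_i = exp(z_i / T) / sum_j exp(z_j / T). T < 1 sharpens the distribution (more deterministic); T > 1 flattens it (more random). T = 1 leaves the distribution unmodified. T = 0 cannot be applied as a division, so it is taken as the limit T → 0, which is greedy decoding: always pick the highest-logit token.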
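A minimal sketch of the scaling described above, using only the Python standard library. The function names `softmax_with_temperature` and `sample_token`, and the three example logits, are illustrative assumptions rather than any particular library's API. Handling T = 0 as an argmax special case is one common convention, assumed here.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        # Division by zero is undefined; greedy decoding is handled separately.
        raise ValueError("temperature must be > 0; use argmax for greedy decoding")
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng=random):
    """Pick a token index. Assumption: temperature 0 means greedy (argmax)."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical scores for a three-token vocabulary.
logits = [2.0, 1.0, 0.1]
for t in (0.1, 0.7, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: {[round(p, 3) for p in probs]}")
print("greedy pick:", sample_token(logits, 0))
```

Running the loop shows the probability of the top token shrinking as T grows, while the greedy pick always returns the index of the largest logit.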
Temperature 0.7 for creative text
Temperature 0.1 for factual answers
Greedy decoding (T = 0) for deterministic output