Here are rough notes (to self) on the Copilot story. Source

  • knew that they wanted to build something using GPT-3
  • started prototyping: demos were fabulous
    • demo being good is not a sufficient condition
  • models were not good enough for a chat interface - 25% of answers I loved, 75% were garbage
  • code synthesis - synthesizing large function calls - not that satisfying
  • small-scale autocomplete with the large models - IntelliSense dropdown UI
    • UI was not the right thing
    • User would get multiple options for the function body - read and pick the right one
      • use the human feedback to improve the model
      • reasons this was bad
        • hit a key to request it
        • wait for it to come back
        • read three functions and click the right one - too much cognitive effort
          • result was that none of them were good, or you couldn’t tell which one was
        • lots of effort on the user but not a lot coming out of it
  • Alex said to use the cursor position in the AST to figure out where you are in the code
    • if you are at the beginning, complete the whole block
    • if you are in the middle, just complete one line
    • automatically generated with no user interaction
    • model was small enough to be low-latency but big enough to be accurate
    • only once all of these pieces were in place did the median new user love Copilot
  • other dead-ends too along the way.
  • it was obvious it was good - 100s of users who were GitHub engineers
  • retention numbers
    • 60%+ at 30 days
    • very intrusive product
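
Sketch (to self) of the cursor-position-in-the-AST idea from Alex's suggestion above: cursor at the beginning of a block means complete the whole block, cursor in the middle means complete one line. This is only an illustration under assumptions, not how Copilot actually does it. It uses Python's standard `ast` module on source that already parses; a real editor would need an error-tolerant parser for half-typed code. The names `completion_scope`, `BLOCK_NODES`, and `SAMPLE` are made up for this note.

```python
import ast

# Node types treated as "blocks" for this sketch (an assumption, not Copilot's list).
BLOCK_NODES = (
    ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef,
    ast.For, ast.While, ast.If, ast.With, ast.Try,
)


def completion_scope(source: str, cursor_line: int) -> str:
    """Return "block" if the cursor is at the start of a block body, else "line".

    cursor_line is 1-indexed, matching ast's lineno convention.
    """
    tree = ast.parse(source)
    innermost = None
    for node in ast.walk(tree):
        if not isinstance(node, BLOCK_NODES):
            continue
        if node.lineno <= cursor_line <= node.end_lineno:
            # Prefer the enclosing block that starts latest, i.e. the most nested one.
            if innermost is None or node.lineno >= innermost.lineno:
                innermost = node
    if innermost is None:
        # Not inside any block: fall back to a single-line completion.
        return "line"
    first_body_line = innermost.body[0].lineno
    # At or before the first statement of the body counts as "the beginning".
    return "block" if cursor_line <= first_body_line else "line"


# Hypothetical sample file used only to exercise the heuristic.
SAMPLE = (
    "def add(a, b):\n"
    "    pass\n"
    "\n"
    "def mul(a, b):\n"
    "    x = a\n"
    "    y = b\n"
    "    return x * y\n"
)

if __name__ == "__main__":
    print(completion_scope(SAMPLE, 2))  # cursor on the first body line of add
    print(completion_scope(SAMPLE, 6))  # cursor in the middle of mul
```

The decision needs no keypress from the user, which matches the "automatically generated with no user interaction" point; latency and model size are separate concerns that this sketch does not touch.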