What is GitHub Copilot?
“30% of newly written code (in some languages) is written/suggested by GitHub Copilot” was the latest catchy headline, taken from an Axios.com interview with a GitHub insider.
But what exactly is GitHub Copilot capable of, what are its use cases, and what worries and opportunities does it bring?
GitHub Copilot is a developer assistance tool, in technical preview since June and dubbed “Your AI pair programmer”. Its landing page is accessible at https://copilot.github.com
As it advertises, it is “More than autocomplete”: it can suggest multiple lines of code based on just a comment or a fragment of code. It gives developers a second pair of hands, a copilot.
Copilot is based on OpenAI Codex, an AI system designed to generate code from natural-language commands. Codex is a descendant of GPT-3 and is more powerful at this task, partly because it has a larger memory for context (14KB vs 4KB). OpenAI's demo video shows a small JS game being built just from input like “When clicked make rocket move at 4x speed for 0.25s”.
OpenAI is a hybrid for-profit and non-profit (“capped profit”) company known for the GPT-3 language model and many other ML models. It was famously co-founded (as a non-profit) by Elon Musk. Since its 2019 transition (https://tcrn.ch/2u4k6ie) to a for-profit “with limited profit”, its preferred partner is Microsoft and its research is not open-source. (GitHub is now a subsidiary of Microsoft.) The shift is justified by AI research being heavily capital-intensive. Enough about OpenAI and Codex; this article is about GitHub Copilot.
Using GitHub Copilot
Using GitHub Copilot is certainly not ideal, and it won’t “replace programmers” any time soon, but it’s pretty powerful and handy. I’ve found myself yearning for a plugin for regular Visual Studio that would let me use it in my usual C# workflow. It was recently introduced to JetBrains IDEs and Neovim!
To walk you through an example:
1. I create a Python file in Visual Studio Code
2. I start with a comment:
“# Download weather information for Warsaw, Poland from”
and it immediately suggests OpenWeatherMap
3. I’m not happy with its next-line suggestion, so I hover over it and get 4 more suggestions; being lazy, I go with “# and print it to stdout”. I then continue by importing the requests and json modules
4. I type “def”, and it suggests the name get_weather(city) along with the function body
5. I enter the API key I’ve just created and run the code (btw, it works)
Sounds like a dream come true? Yeah, it partially is, but the code is not perfect. Even here, it suggested the unsafe but common practice of hard-coding the API key in the source (and when I went through the other suggestions, one of them actually contained a non-working API key). Printing inside the function and leaving the country out of the request are not great solutions either (though both can be changed by picking another suggestion).
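To make these points concrete, here is a minimal sketch of how the suggested function could be tightened by hand. It is not what Copilot produced. Several things here are my assumptions: the OpenWeatherMap “current weather” endpoint and its q, appid and units parameters, the environment variable name OPENWEATHER_API_KEY, and the helper build_weather_url. I also use the standard library's urllib instead of requests so the snippet has no dependencies.

```python
import json
import os
import urllib.parse
import urllib.request

# Assumed endpoint: OpenWeatherMap "current weather" API.
API_URL = "https://api.openweathermap.org/data/2.5/weather"


def build_weather_url(city: str, country: str, api_key: str) -> str:
    """Build the request URL, including the country code in the query."""
    query = urllib.parse.urlencode(
        {"q": f"{city},{country}", "appid": api_key, "units": "metric"}
    )
    return f"{API_URL}?{query}"


def get_weather(city: str, country: str) -> dict:
    """Return the parsed weather data instead of printing it."""
    # Read the key from the environment rather than hard-coding it.
    # OPENWEATHER_API_KEY is a hypothetical variable name.
    api_key = os.environ["OPENWEATHER_API_KEY"]
    url = build_weather_url(city, country, api_key)
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


# Example call (needs network access and a valid key in the environment):
# print(json.dumps(get_weather("Warsaw", "PL"), indent=2))
```

Keeping the key in the environment means it never lands in version control, and returning the data leaves it to the caller to decide whether to print it.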
So, as shown here, it can’t really be used as a fully-privileged Copilot. It is more like a Junior Copilot, or an Apprentice Pilot, which can be given a small, dumb task and has to be corrected. Sometimes it even suggests libraries that don’t exist.
From my experience of working on a larger codebase, it often suggests things that don’t make sense. Still, it’s good enough for writing out repetitive things, and good enough for a quick test, after which I quickly change the code based on what I learned from Copilot’s suggestions.
A great article with tips for using Copilot is “Tips for using GitHub Copilot in everyday Python programming”. In short: “make your code readable” to improve the quality of suggestions, “use type hints”, try the next suggestions, and expect it to know the most popular libraries.
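As a rough illustration of the “readable code” and “use type hints” tips, this is the kind of context I would try to give Copilot before asking for a suggestion. The names WeatherReport and format_report are hypothetical and only serve as an example. The function body here is written by hand, not generated.

```python
from dataclasses import dataclass


@dataclass
class WeatherReport:
    """Hypothetical container for the values we care about."""
    city: str
    temperature_celsius: float


def format_report(report: WeatherReport) -> str:
    """Return a one-line summary such as 'Warsaw: 21.5°C'."""
    # Descriptive names, type hints and a docstring with an example
    # give the model (and human readers) more to work with.
    return f"{report.city}: {report.temperature_celsius:.1f}°C"
```

For instance, format_report(WeatherReport("Warsaw", 21.5)) returns "Warsaw: 21.5°C".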
Problems
When it comes to problems with Copilot, the first that comes to mind is security. According to a study quoted by David Ramel in VS Magazine, 40% of its suggested programs had vulnerabilities.
It can also contribute to leaks of secrets, such as the API keys of sloppy programmers.
Copyright infringement is another worry. Not everything it suggests is original, and the longer a suggestion is, the higher the chance that it is reproducing someone’s code from memory.
As shown in a funny tweet (https://twitter.com/mitsuhiko/status/1410886329924194309), GitHub Copilot generated a fake license for the famous fast inverse square root code from Quake III Arena.
However, every programmer uses StackOverflow anyway. Going further, as Stephan Kinsella argues in Against Intellectual Property, IP could be considered unjustifiable because ideas are not scarce: an idea cannot be owned.
Future of AI in coding, future of Copilot.
If Copilot were priced against Tabnine (Link to price table), an AI code assistant/completion tool, it could cost $10 for the regular version and $25 for Enterprise, assuming it were sold outside of the GitHub Pro subscription. It could cost even less for the standard version, as GitHub could aim for maximum adoption, like it does with its GitHub subscription. On the other hand, running it must be more expensive than running Tabnine.
The story probably doesn’t end at code suggestions. To produce better suggestions, Copilot will probably need a quality control model in its training, and that model could then be offered to users as well. Speaking of quality control, we could someday see Copilot, or another incarnation of it, used for refactoring existing code, paired with static analysis. Its context could be enhanced with simplified overviews of code from other files, and it could be personalized with existing docs and libraries to power its suggestions. It could even become a standard for each open-source library to ship code fragments designed for code suggestions.
And of course, with time, it can get even more powerful, with bigger and better-designed models. It seems to be a rule, even a new Moore’s law, that language models get 10x bigger every year, as suggested by Hugging Face. Photonic computers, like the one from Lightmatter, are only going to accelerate that.
MM