Monthly Archives: October 2021

The Inevitable Rise of GitHub Copilot

Reading Time: 5 minutes

What is GitHub Copilot?

“30% of newly written code (in some languages) is written or suggested by GitHub Copilot” was the latest catchy claim from an interview with a GitHub insider on Axios.com.
But what exactly is GitHub Copilot capable of, what are its use cases, and what concerns and opportunities does it raise?

GitHub Copilot representative chart
Source: GitHub

GitHub Copilot is a developer-assistance tool, in technical preview since June, dubbed “Your AI pair programmer”. Its landing page is accessible at https://copilot.github.com
As it advertises, it is “More than autocomplete”: it can suggest multiple lines of code based on just a comment or a fragment of code. It gives developers a second pair of hands, a copilot.

Demo of GitHub Copilot from its landing page
Source: GitHub

Copilot is based on OpenAI Codex, an AI system designed to generate code from text commands. Codex is a descendant of GPT-3 and is more powerful for code (14 KB of context memory vs GPT-3’s 4 KB). In its demo video, a small JS game is built from inputs like “When clicked make rocket move at 4x speed for 0.25s”.

OpenAI Codex demo - rocket
Source: OpenAI

OpenAI is a hybrid for-profit/non-profit (“capped-profit”) company known for the GPT-3 language model and many other ML models. It was famously co-founded (as a non-profit) by Elon Musk. Since its 2019 transition to a for-profit “with capped returns” (https://tcrn.ch/2u4k6ie), its preferred partner has been Microsoft and its research is no longer open-source (GitHub is now a Microsoft subsidiary). The shift is justified by AI research being heavily capital-intensive. But enough about OpenAI Codex; this article is about GitHub Copilot.

Using GitHub Copilot

Using GitHub Copilot is certainly not an ideal experience, and it won’t “replace programmers” any time soon, but it’s pretty powerful and handy, and I’ve found myself yearning for a proper Visual Studio plugin that would let me use it in my regular C# workflow. It has recently been introduced to JetBrains IDEs and Neovim!

To walk you through an example:
1. I create a Python file in Visual Studio Code.
2. I start with a comment: “# Download weather information for Warsaw, Poland from ” and it already suggests OpenWeatherMap.
3. I’m not happy with its next-line suggestion, so I hover over it and get 4 more suggestions. Being lazy, I go with “# and print it to stdout”; I later continue by importing the requests and json modules.
4. I type “def” and it suggests the name get_weather(city) along with the code.
5. I enter an API key I’ve just created and run the code (by the way, it works). A sketch of the resulting code is shown below the demo.

Demo of code generated by me
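For reference, here is a minimal sketch of what the generated script roughly looked like. The exact wording is my reconstruction rather than Copilot’s verbatim output, and it deliberately keeps the hard-coded API key flaw discussed below.

```python
# Reconstruction (my approximation, not Copilot's verbatim output) of the
# generated weather script. Requires the `requests` package.
import json
import requests

# Download weather information for Warsaw, Poland from OpenWeatherMap
# and print it to stdout
def get_weather(city):
    # Hard-coding the key is the unsafe-but-common practice Copilot
    # suggested; prefer an environment variable in real code.
    api_key = "YOUR_OPENWEATHERMAP_API_KEY"
    url = (
        "https://api.openweathermap.org/data/2.5/weather"
        f"?q={city}&appid={api_key}"
    )
    response = requests.get(url)
    print(json.dumps(response.json(), indent=2))

get_weather("Warsaw")
```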

Sounds like a dream come true? It partially is, but the code is not perfect. Even here, it suggested the unsafe but common practice of hard-coding the API key in the source (and when I went through the other suggestions, it actually gave me a non-working API key). Printing inside the function and not including the country in the request are not great solutions either (both can be changed by going with another suggestion).

So, as shown here, it can’t actually be used as a fully-privileged copilot; it’s more like a junior copilot, or an apprentice pilot, which can be given a small, dumb task and has to be corrected. Sometimes it even suggests libraries that don’t exist.

From personal experience working on a larger codebase, it often suggests things that don’t make sense. Still, it’s good enough for writing down repetitive things, and for producing a quick first draft that you then rework using what you learned from Copilot’s suggestions.

A great article with tips for using Copilot can be found here: Tips for using GitHub Copilot in everyday Python programming. In short: “make your code readable” to improve the quality of suggestions, “use type hints”, try the next suggestions, and you can expect it to know most popular libraries.
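To illustrate the type-hints tip with a hypothetical toy example of my own (not taken from the linked article): a typed signature alone tells Copilot what goes in and what comes out, so the body it proposes is far more likely to fit.

```python
# Hypothetical prompt (mine, not from the linked article). The hints say
# the input is a list of floats and the output a single float, which
# strongly constrains what a sensible completion can be.
def average_temperature(readings: list[float]) -> float:
    # A completion consistent with the hints:
    return sum(readings) / len(readings)

print(average_temperature([12.5, 14.0, 13.1]))  # 13.2
```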

Problems

When it comes to problems with Copilot, the first that comes to mind is security. According to a study quoted by David Ramel in Visual Studio Magazine, about 40% of the programs it suggested had vulnerabilities.

Copilot leaking secrets
Source: dtjm @ GitHub

It can also contribute to leaks of secrets, such as the API keys of sloppy programmers.
Copyright infringement is another worry. Not everything suggested is original, and the longer a suggestion is, the higher the chance it is reproducing someone’s code from memory.
As shown in a funny tweet (https://twitter.com/mitsuhiko/status/1410886329924194309), GitHub Copilot reproduced the famous fast inverse square root code from Quake III Arena and generated a fake license for it.

Then again, every programmer copies from StackOverflow anyway. Going further, as Stephan Kinsella argues in Against Intellectual Property, IP could be considered unjustifiable because ideas lack scarcity: an idea cannot be owned.

The future of AI in coding, the future of Copilot

Pricing of Tabnine

If Copilot were priced against Tabnine (link to price table), an AI code assistant/completion tool, it could cost $10 for the regular tier and $25 for Enterprise, if it were offered outside the GitHub Pro subscription. The standard version could cost even less, as GitHub could aim for maximum adoption, as it does with its GitHub subscriptions. Running it, however, must be more expensive than running Tabnine.

The story probably doesn’t end at code suggestions. To produce better suggestions, a quality-control model will probably need to be introduced into training, and that model could then be offered to users as well. Speaking of quality control, someday we could see Copilot, or another incarnation of it, used for refactoring existing code, paired with static analysis. Its context could be enhanced with simplified overviews of code from other files, and it could load, or be personalized with, existing docs and libraries to power its suggestions. It could even become a standard for every open-source library to ship code fragments designed for code suggestions.

And of course, with time, it can get even more powerful, with bigger and better-designed models. It seems to be a rule, almost a new Moore’s law, that language models get 10x bigger every year, as suggested by Hugging Face. And photonic computers, like the one from Lightmatter, are only going to accelerate that.

Performance of the Lightmatter Envise
Source: lightmatter.co

MM

Is it still worth spending money on NASA?

Reading Time: 2 minutes

After the Space Race ended, some wondered whether it is still worth spending money on NASA, how it benefits humanity, and whether we should invest in space projects at all when we still have so many problems back on Earth.

But first, let’s clarify exactly how much NASA gets to complete its objectives. Many people believe it is about 20% of the whole US budget; in reality, NASA’s proposed funding for 2022 is $24.8 billion, which, even though it is more than in previous years, is still less than 1% of the US budget. Even with that, NASA still manages to do things that would have been impossible if we had stayed on Earth.

Coming back to my previous point about spending money on space while there are still many problems on Earth: space programs can help people back on Earth. A good example is the SMAP satellite, which scans the soil of the whole planet every 2 or 3 days, even through clouds, and sends back information that can be used to predict floods, droughts, and other natural disasters harmful to crops. That is especially crucial for countries with high poverty levels and unstable weather conditions, like much of Africa. NASA has a data access policy of making all the information and knowledge it gathers free for anybody. That is why this cannot be done by private companies with space projects, like SpaceX: companies need to earn back the money they spend.

SMAP in space

The second thing worth mentioning is a possible extinction threat, like a meteor or another massive object striking the planet. The probability is extremely low, but the consequences are severe enough to make preparing countermeasures a necessity. An early-warning system for incoming meteors has been put in place, and there are plans to test whether hitting an asteroid in space with a spacecraft could change its trajectory.

Considering all that, I think it is still worth spending money on NASA and space projects. Going back to the SMAP example: of course we could instead help by giving food to those in need, but it makes far more sense to choose a permanent solution, giving them information about possible dangers to crops, their primary food source, and letting these people help themselves, rather than helping them just once while spending roughly the same amount of funding.

Sources:

https://www.nasa.gov/careers/our-mission-and-values

https://www.nasa.gov/smap

https://www.space.com/biden-nasa-2022-budget-request-science-artemis


Neuro-transformer GPT-3

Reading Time: 3 minutes

Nowadays, the most advanced neural network for NLP (natural language processing, that is, algorithms that work with text) is GPT-3. It is a transformer neural network able to generate coherent responses in a dialogue with a person. The amount of data and the number of parameters it uses are 100 times greater than in the previous generation, GPT-2.

The GPT-3 neural network (Generative Pre-trained Transformer) was developed by OpenAI, the organization founded by the head of SpaceX, Elon Musk, and the ex-president of the Y Combinator accelerator, Sam Altman. The third generation of the natural language processing program was presented to the public in May 2020. Today it is the most complex and voluminous language model in existence.

However, even the most advanced transformers trained on huge amounts of data do not understand the meaning of the words and phrases they generate. Their training requires huge amounts of data and computing resources, which, in turn, leave a large carbon footprint. Another problem is the imperfection of datasets for training neural networks: texts on the Internet often contain distortions, manipulations and outright fakes.

One of the most promising directions in the development of AI and neural networks is expanding their range of perception. Algorithms can already recognize images, faces, fingerprints, sounds, and voices. They can also speak and generate images and videos, imitating our perception through different senses. MIT scientists note that AI lacks the emotional intelligence and feelings needed to come closer to humans. Unlike AI, a person is able not only to process information and produce ready-made solutions, but also to take into account context and a variety of external and internal factors, and, most importantly, to act in an uncertain and changing environment. For example, DeepMind’s AlphaGo algorithm and its successors can beat the world champions in go and chess, but still cannot extend their strategy beyond the board.

So far, even the most advanced algorithms, including GPT-3, are only on the way to this. Now the developers are faced with the task of creating multimodal systems that would combine text recognition and sensory perception to process information and find solutions.

What are the abilities of GPT-3?

T9 on a new level

“I know that my brain is not a ‘feeling brain’. But it can make rational, logical decisions. I learned everything I know just by reading the Internet, and now I can write this column,” the GPT-3 neural network confided in its essay for The Guardian. The piece, published in September 2020, made a lot of noise; even people far from technology started talking about the new algorithm.

Just like its predecessors, GPT-1 and GPT-2, it is built on the transformer architecture. The main function of these neural networks is to predict the next word, or part of one, based on the preceding words. In effect, the model calculates the connections between words and suggests the most likely sequence. It works on the principle of auto-completion, almost like the T9 function in smartphones: starting from one or two phrases, it can instantly generate several pages of text.
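To make the auto-completion analogy concrete, here is a toy sketch of my own (real GPT models use learned neural weights over tokens, not word counts) of greedily picking the most likely next word given the preceding one:

```python
# Toy next-word predictor (my illustration, not OpenAI's code): count how
# often each word follows each other word, then greedily extend a prompt.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the dog slept on the rug".split()

# Bigram counts: word -> Counter of the words observed right after it.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def autocomplete(word: str, length: int = 5) -> str:
    out = [word]
    for _ in range(length):
        candidates = following.get(out[-1])
        if not candidates:
            break
        # Greedy choice: the single most likely next word.
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)

print(autocomplete("the"))  # "the cat sat on the cat" for this tiny corpus
```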

The way it was trained

GPT-3 differs from the two previous generations in the volume of its datasets and the number of parameters, the variables the algorithm optimizes during training. The first version of GPT, released in 2018, was trained on 5 GB of text from Internet pages and books, and its size reached 117 million parameters. A year later, the more advanced GPT-2 appeared, with 1.5 billion parameters, trained on 40 GB of data.

But the third version of the algorithm beat the previous ones by a large margin. The number of parameters reached 175 billion, and the dataset grew to 600 GB. It includes the entire English-language Wikipedia, books and poems, material from media sites and GitHub, guidebooks, and even recipes. Approximately 7% of the dataset was in languages other than English, so the model can both generate texts of any format and translate them.

The algorithm was “fed” not only verified and confirmed data, but also texts whose reliability raises questions — for example, articles about conspiracy theories and pseudoscientific calculations. On the one hand, because of this, some of the generated texts contain incorrect information. On the other hand, thanks to this approach, the dataset turned out to be more diverse. And it reflects much more fully the information array that humanity has produced by 2020 than any scientific library.

The algorithm is fundamentally different from other artificial intelligence models, which are usually created for one purpose, with all parameters and datasets tailored to it from the start. GPT-3 is more flexible: it can be used to solve “almost any task” formulated in English. And instead of retraining it on additional data, it is enough to express the task in the form of a text query, a description, or a few examples.
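As a rough sketch of what that looks like in practice, here is a few-shot prompt using the openai Python package's Completion endpoint as it existed in 2021 (the translation examples and engine choice are illustrative, not from the article):

```python
# Sketch of few-shot prompting with the 2021-era OpenAI API. The task is
# "trained" purely by the examples embedded in the prompt itself.
import openai

openai.api_key = "YOUR_OPENAI_API_KEY"  # assumed to be set up beforehand

prompt = (
    "Translate English to French:\n"
    "sea otter => loutre de mer\n"
    "cheese => fromage\n"
    "plush giraffe =>"
)

response = openai.Completion.create(
    engine="davinci",   # base GPT-3 model
    prompt=prompt,
    max_tokens=10,
    temperature=0,      # keep the output deterministic-ish
    stop="\n",          # stop at the end of the answer line
)
print(response.choices[0].text.strip())  # expected: "girafe en peluche"
```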

All Homes of The Future Should (Not) Come With Wheels

Reading Time: 2 minutes

This article on tech and the environment, written by Parag Khanna, proposes that all homes of the future should come with wheels, and I am here to critically assess his point of view. I disagree with this notion, especially with some of the claims made in the article.

He states that trailer homes and RVs are a “cost-effective, and sustainable alternative to traditional homeownership.” Although there may be some merit to this statement, he further claims that millennials “witnessed the financial crisis demolish their parents’ house value,” which is supposedly why they are switching to mobile homes. This, in my opinion, is a ridiculous claim: from 2006 to 2011, for example, residential property values grew by 80-100% on average, and then by nearly 50% from 2011 to 2017. The statement therefore holds only if these millennials choose not to believe that house values rise.

As for the claim that an RV is a sustainable alternative to traditional homes, this would only be true if you do not actually move the RV and instead have it sit in one place, as the majority of RVs have diesel engines that are terrible for the environment. According to another article, Mara Johnson-Groh lived in her RV and did the calculations herself to see how much gas emissions she was producing. After accounting for driving, laundry, cooking, and what she ate, she came out at 6.6 U.S. tons of carbon dioxide emissions, 1.2 U.S. tons above the global average. This is primarily because driving her RV contributed about 10,000 additional pounds of carbon dioxide per year. Clearly not the “sustainable alternative” the original article suggests. Of course, you can drive less, switch to an electric engine, or make a myriad of other improvements, but the exact same can be said of a traditional home: you can reduce its footprint with solar panels, reduced waste, cold washing, and so on.

In conclusion, I believe his position that all homes should have wheels rests on assumptions, without facts to back up the claims about sustainability or about an aversion to buying a home caused by living through a financial crisis. Having assessed both of these claims and found them wanting, I would call the article’s point of view baseless. That said, I do not reject the idea of RVs as an alternative to a traditional home if it is done the right way in terms of sustainability: installing solar panels, adding LED lights, and, most importantly, swapping the engine for an electric one. Or, if you drop the sustainability argument entirely and choose an RV solely for cutting costs, for the mobility, and for the convenience, then it is certainly a good alternative. But with that said: no, all future homes should not come with wheels.

https://www.popsci.com/technology/parag-khanna-move-book-excerpt/

https://www150.statcan.gc.ca/n1/daily-quotidien/130425/dq130425b-eng.htm

https://outsideonline.com/adventure-travel/essays/is-vanlife-ecofriendly