Fine-tuning GPT3

13 February 2023

AI, Apps

Reading Time: 3 minutes

…to improve performance in specific tasks and domains

GPT3 is a powerful language generation model, which makes it well suited to building chatbots, conversational interfaces and other AI-driven applications such as automated content creation. We can also use GPT3 to generate code, art and even music. In most cases, the key skill is knowing how to ask the right questions. In more advanced projects, knowing how to use the GPT3 API to build automated tools can greatly speed up our business or academic tasks.

This is one of the reasons why fine-tuning GPT3 is so important. Fine-tuning for a specific task or domain involves further training a pre-trained model on a data set that is specific to that task or domain. This process is a form of transfer learning: it lets the model adapt to new tasks by adjusting the weights of the pre-trained model to better fit the new data and emphasize a specific topic.

Take the example of fine-tuning GPT3 for sentiment analysis. For this we would train the model on a data set of text labeled with sentiment, where the sentiment can be positive, negative or neutral. This is useful, among other things, for a data analyst who wants to find the sentiment of different tweets about, for example, a statement made by a given politician or that politician's general character.
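To make the idea of a sentiment-labeled data set concrete, here is a minimal sketch of how such examples could be written out as JSON Lines. The "prompt"/"completion" field names follow the layout that OpenAI's fine-tuning documentation described around the time of writing; the example tweets, the PROMPT_SUFFIX separator and the helper names are assumptions made for illustration only.

```python
import json

# Invented example tweets with hand-assigned sentiment labels.
labeled_tweets = [
    ("I love the new policy, great work!", "positive"),
    ("This statement is a disaster.", "negative"),
    ("The minister spoke at noon.", "neutral"),
]

# Hypothetical separator marking where the prompt ends and the label begins.
PROMPT_SUFFIX = "\n\nSentiment:"


def to_training_record(text, label):
    """Turn one labeled tweet into a prompt/completion pair."""
    return {"prompt": text + PROMPT_SUFFIX, "completion": " " + label}


def to_jsonl(examples):
    """Serialize all examples as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(to_training_record(t, l)) for t, l in examples)


jsonl_text = to_jsonl(labeled_tweets)
print(jsonl_text)
```

Each output line is one training example, so a real data set is simply a much longer file of the same shape.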

Fine-tuning in general can be done with deep learning libraries such as TensorFlow or PyTorch, by adjusting the parameters of a pre-trained model using new data. This process can take anywhere from a few hours to a few days, depending on the size of the dataset and the available computational resources.
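The idea of adjusting the parameters of a pre-trained model using new data can be illustrated with a toy example. The sketch below is not GPT3 and does not use TensorFlow or PyTorch: it is a tiny logistic regression in plain Python, whose "pre-trained" weights and training examples are invented, and it only shows the principle of starting from existing weights and nudging them with gradient descent.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Probability that the example is positive."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def log_loss(weights, bias, data):
    """Average cross-entropy loss over the data set."""
    eps = 1e-12
    total = 0.0
    for features, label in data:
        p = predict(weights, bias, features)
        total -= label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps)
    return total / len(data)


def fine_tune(weights, bias, data, learning_rate=0.1, epochs=200):
    """Start from existing parameters and adjust them with gradient descent."""
    weights = list(weights)
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in data:
            error = predict(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Hypothetical "pre-trained" parameters (invented for this sketch).
pretrained_weights = [0.5, -0.5]
pretrained_bias = 0.0

# Invented new-domain data: (positive-word count, negative-word count) -> label.
new_data = [
    ([2.0, 0.0], 1),
    ([1.0, 0.0], 1),
    ([0.0, 2.0], 0),
    ([0.0, 1.0], 0),
    ([1.0, 1.0], 1),
]

loss_before = log_loss(pretrained_weights, pretrained_bias, new_data)
tuned_weights, tuned_bias = fine_tune(pretrained_weights, pretrained_bias, new_data)
loss_after = log_loss(tuned_weights, tuned_bias, new_data)
print("loss before:", round(loss_before, 3), "loss after:", round(loss_after, 3))
```

Real fine-tuning follows the same loop with far more parameters and data, which is where the hours or days of compute come from.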

During a usual chat with ChatGPT, you may notice that it remembers what you asked a few messages ago and can make changes based on its previously generated answers. Within a conversation, the chat adapts to what we tell it. Fine-tuning serves a similar purpose, but on a much larger scale and by updating the model itself. Moving into the programming area, we can fine-tune GPT3 models through the OpenAI API.

The fine-tuning process requires access to a dataset and a development environment to train the model, neither of which is provided directly in the chat interface. So, to fine-tune GPT3 you need to create an OpenAI API key and use it to access the GPT3 models.

You can then use the API to fine-tune the model on your specific task or domain by providing a dataset. Alternatively, you can get pre-trained models that are already tuned for specific domains or tasks from a number of providers, such as https://huggingface.co/models. You can use these models without training on your own data set.

Another technique is data augmentation, which is used to improve the performance of fine-tuned models by artificially increasing the size and variety of the training data. This can be done in various ways, such as adding noise to the data, rotating and flipping images (for image data) and creating new examples by combining existing ones. This can help make the model more robust and reduce overfitting.
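For text data, "adding noise" can be as simple as randomly dropping or swapping words. The following is a minimal sketch of that idea; the function names, the drop probability and the sample sentences are all assumptions chosen for illustration, and real projects often use more careful techniques.

```python
import random


def drop_words(sentence, drop_probability, rng):
    """Randomly remove words; fall back to the original if all were dropped."""
    words = sentence.split()
    kept = [w for w in words if rng.random() >= drop_probability]
    return " ".join(kept) if kept else sentence


def swap_adjacent_words(sentence, rng):
    """Swap one randomly chosen pair of neighbouring words."""
    words = sentence.split()
    if len(words) < 2:
        return sentence
    i = rng.randrange(len(words) - 1)
    words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)


def augment(sentences, copies_per_sentence=2, drop_probability=0.1, seed=0):
    """Return the original sentences plus noisy copies of each one."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    augmented = list(sentences)
    for sentence in sentences:
        for _ in range(copies_per_sentence):
            noisy = drop_words(sentence, drop_probability, rng)
            augmented.append(swap_adjacent_words(noisy, rng))
    return augmented


# Invented sample sentences.
original = [
    "the patient reports mild chest pain",
    "no fever was observed after treatment",
]
augmented = augment(original)
print(len(original), "->", len(augmented))  # prints "2 -> 6"
```

The original sentences are kept alongside the noisy copies, so the augmented set is strictly larger than the starting one.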

For example, using data augmentation to artificially increase the size and diversity of a medical dataset can help GPT3 learn medical-specific language and terminology. Transfer learning, on the other hand, allows the model to adapt more efficiently to a new task or domain. I strongly encourage you to experiment with ChatGPT, as it can save us many hours of tedious work and improve the end result.

Source: OpenAI ChatGPT Master for Business and Software Applications

11 February 2023

AI, Programming languages

When life gives you data, make an analysis!

Reading Time: 2 minutes

… or a day in the life of a data analyst

Data analysts, who are entrusted with converting raw data into insights that can be used to assist decision-making, are the unsung heroes of the data industry. As part of their work, large amounts of data need to be gathered, cleaned and analyzed in order to find trends, patterns, and linkages that could otherwise go unnoticed. In a time when data is king, data analysts are essential in helping businesses make sense of the massive volumes of data they produce every day.

A data analyst’s day can be varied and difficult, involving everything from gathering and cleaning data to examining and visualizing it to developing and testing predictive models. In addition to having a solid grasp of statistics, data visualization, and machine learning, data analysts must be able to concisely and clearly convey their findings to stakeholders. In order to comprehend the business context of the data and guarantee that their research meets the objectives of the stakeholders, they must also be able to work collaboratively with cross-functional teams that include engineers, product managers, and business analysts.

One of the most important tools in the data analyst’s arsenal is the programming language Python, which has become the de facto language for data analysis and data science. Python offers a wealth of libraries and tools that make it easy to perform data analysis tasks, such as collecting data, cleaning data, exploring data, and building predictive models.

Here are some of the most common Python libraries used for data analysis:

Pandas: A fast, flexible, and powerful data analysis and manipulation library, used for tasks such as data cleaning, aggregation, and transformation.

NumPy: A library for numerical computing in Python, used for tasks such as linear algebra, random number generation, and array operations.

Matplotlib: A 2D plotting library for Python, used for tasks such as data visualization, histograms, and scatter plots.

Seaborn: A data visualization library based on Matplotlib, used for tasks such as regression plots, heatmaps, and violin plots.

Scikit-learn: An open-source library for machine learning in Python, providing a wide range of algorithms for classification, regression, clustering, and dimensionality reduction.

TensorFlow: A popular open-source platform for developing and training ML models, used for a wide range of tasks including image recognition, natural language processing, and time series forecasting.

PyTorch: An open-source ML framework for building and training deep learning models, used for tasks such as image classification, sequence analysis, and recommendation systems.
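As a small illustration of the first two libraries on this list, here is a minimal sketch of a cleaning and aggregation step with Pandas and NumPy. The data frame contents and column names are invented for the example.

```python
import numpy as np
import pandas as pd

# Invented sales data with one missing value.
raw = pd.DataFrame({
    "region": ["north", "north", "south", "south", "south"],
    "sales": [120.0, np.nan, 90.0, 110.0, 100.0],
})

# Cleaning: drop rows where the sales figure is missing.
clean = raw.dropna(subset=["sales"])

# Aggregation: average sales per region.
mean_by_region = clean.groupby("region")["sales"].mean()

# Array operation: median of all remaining sales values.
overall_median = float(np.median(clean["sales"].to_numpy()))

print(mean_by_region)
print("median:", overall_median)
```

The same pattern (load, clean, group, summarize) covers a large share of everyday analysis work before any plotting or modeling begins.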

To conclude, data analysts are essential to helping businesses understand the enormous amounts of data they produce every day. To transform data into insights and encourage reasoned decision-making, they combine technical abilities, like Python programming and machine learning, with soft skills, like cooperation and communication. The world of data is an exciting and gratifying place to be, and there are endless opportunities for growth and development whether you are an experienced data analyst or just getting started.

Sources:

https://pandas.pydata.org/docs/

https://numpy.org/doc/stable/

https://matplotlib.org/stable/contents.html

https://seaborn.pydata.org/

https://scikit-learn.org/stable/

https://www.tensorflow.org/

https://pytorch.org/

28 December 2020

Uncategorized

DeepMind’s MuZero

Reading Time: 2 minutes

MuZero is perceived as an important step forward in the search for general-purpose algorithms. To put it simply, MuZero is a computer program developed by DeepMind to master games and new artificial environments without knowing their rules.

MuZero is one of the newest solutions in the pursuit of methods that can not only learn a model which explains their environment, but can also use it to plan the best course of action. The program masters games like chess, Go, shogi and Atari without being told the rules in advance.

“MuZero really is discovering for itself how to build a model and understand it just from first principles.”

— David Silver, DeepMind, Wired

Over the last few years DeepMind has released a series of AI programs: AlphaGo (2016), AlphaGo Zero (2017) and AlphaZero (2017). What they all had in common is that they were given the rules of the games they had to master before training began.

MuZero uses different techniques than its predecessors and therefore overcomes their limitations. The program doesn’t try to model the entire environment. Instead, it models only the aspects that are crucial to its decision-making process.

MuZero doesn’t rely on prior knowledge of the environment’s dynamics, such as the rules of the game or an accurate simulator. This ability gives hope that in the near future we will be able to apply the program to messy and complex real-world problems.

Dr David Silver said that DeepMind was already using MuZero to try to develop a new kind of video compression, which could bring massive savings, for example in data volume.

Moreover, its most advanced predecessor, AlphaZero, has been applied to a variety of complex problems in fields like chemistry, quantum physics and more.

Researchers have never been closer to developing a general-purpose algorithm. MuZero marks a new beginning in AI that can significantly accelerate and facilitate tackling real-world problems, which are typically hard to distill into simple rules. There is no doubt this technology will have a notable impact on new challenges in robotics and industrial systems.

It seems that the ability to plan, which allows humans to generalise from experience and make predictions about new scenarios, will sooner or later no longer be an exclusively human domain.

Sources:

https://www.bbc.com/news/technology-55403473

https://www.engadget.com/deepmind-muzero-160024950.html

https://deepmind.com/blog/article/muzero-mastering-go-chess-shogi-and-atari-without-rules

https://www.analyticsdrift.com/deepminds-muzero-marks-new-breakthrough-in-reinforcement-learning/

11 December 2020

Uncategorized

Facebook buy or bury strategy

Reading Time: 2 minutes

During the news conference announcing the antitrust lawsuit against Facebook (9 December 2020), New York Attorney General Letitia James explained the potential harm caused by Facebook’s acquisitions. She said that the acquisitions of Instagram and WhatsApp reduced choices for consumers, restrained innovation and significantly harmed the privacy protections of millions of people. To impede competing services, Facebook implemented a ‘buy or bury’ strategy: acquiring smaller or potential competitors before they could threaten the company’s dominance.

Instagram was bought by Facebook for $1 billion, which was a shocking sum given that Instagram did not have even a dime of revenue at that time and had only thirteen employees. Today Instagram contributes over $20 billion to Facebook’s annual revenue.

Ian Conner, Director of the Federal Trade Commission’s Bureau of Competition, stated that the FTC is aiming to stop or at least significantly slow down Facebook’s anticompetitive behaviour and to restore competition, so that innovation and free competition can thrive.

Facebook defends itself by claiming that all the acquisitions in question were legal and cleared by regulatory agencies, and that overturning them could be very dangerous and have unpredictable consequences. There are also voices claiming that it could simply be too late to react, such as Rep. Jerrold Nadler, who said: “This should never have happened in the first place, and accountability is long overdue.”

Facebook is facing a situation where the FTC and over 40 states are seeking to break it up. Facebook’s acquisition strategy is one of the new potential threats that modern society has to deal with. It is a result of constant technological advancement and globalization. Many questions about privacy cannot yet be answered, and we can only try to predict the potential results of actions taken by big tech companies.

Nevertheless, the $5 billion fine that the FTC imposed on Facebook in 2019 for mishandling users’ information was severely criticised not only by the public, but also by members of the agency itself. We can assume that this open split results from a lack of awareness of the potential consequences of a monopoly held by a company that possesses vast amounts of our private data.

NY Attorney General news conference on antitrust lawsuit against Facebook:

The FTC is suing Facebook to unwind its acquisitions of Instagram and WhatsApp:
https://www.theverge.com/2020/12/9/22158483/facebook-antitrust-lawsuit-anti-competition-behavior-attorneys-general
Mark Zuckerberg bought Instagram as it was a ‘threat’ to Facebook:
https://www.business-standard.com/article/international/mark-zuckerberg-bought-instagram-as-it-was-a-threat-to-facebook-120073000324_1.html#:~:text=Facebook%20bought%20Instagram%20for%20%241,billion%20to%20Facebook’s%20annual%20revenue.
F.T.C. Approves Facebook Fine of About $5 Billion:

2 November 2020

Uncategorized

The time for technology to encroach on harder questions

Reading Time: 2 minutes

The greenfield for finding and developing innovations is steadily shrinking. Every tech solution that was easy to develop has already been developed. We have thousands of similar apps with similar approaches to what mobile tools can give us.

The coming decade will require us to look for answers to more complicated questions than how to create the next app that lets users communicate with other users. We already have a multitude of them.

It’s time to contemplate questions about building a better society and a better planet, and about making our lives easier, for example by empowering us to achieve a deep and conscious work-life balance.

Vast amounts of data can undoubtedly help with this. First, however, we have to know which tools to use to create a useful outcome from the data. This requires us to find a way to network knowledge and to create something fundamental from chaos.

A massive crisis like COVID-19 was a test for our society and authorities. When we look around, all we can see is a significant breakdown. How can technology help us to improve the situation?

What can we do to make the world a better place overall? How can we connect people instead of continuously dividing them into opposing groups? Where are the threats and how can we weaken them as much as possible?

One of the most important problems technology is facing right now concerns privacy. People are worrying about what is happening with their data and, what’s even more terrifying, some of them are not aware of the harmful influence of social media advertising, which is constantly hunting for their attention. Once the attention is caught, it’s easy to manipulate the user’s opinion. The problem is described in “The Social Dilemma” directed by Jeff Orlowski. More regulations and restrictions are crucial.

Looking on the bright side, innovation in education might well be in its golden age right now. Schools and teachers have adapted to the new situation. Now we can see an intensifying move into supplemental part-time teaching. Platforms like “Outschool” make it possible to participate in small-group classes led by teachers on a broad range of topics.

“CEO Amir Nathoo estimates that teachers can make between $40 to $60 per hour, up from an average of $30 per hour in earnings in traditional public schools. Outschool itself has surged over 2,000% in new bookings, and recently turned its first profit.”

The platform is still gaining more full-time and part-time teachers, and there are predictions that in the future these kinds of platforms will be linked in some way with traditional ways of studying.

This will give students more resources to support their learning, especially now, when they can’t simply stay after class and ask their teacher for a further explanation.

https://techcrunch.com/2020/10/31/tech-trends-show-practical-solutions-are-coming-for-humanitys-real-world-problems/

Kozminski Techblog

A blog on technology, run by Kozminski University students and supervised by NeRDS

Author Archives: Rachańczyk Martyna
