Large Language Models
A practical guide to ChatGPT & Co.
11.4.2023
Business Edition
This course is available for corporate use:
From CHF/USD 15.- / User!
Contact info@kanohi.ch
Course content
- What LLMs are
- How LLMs work
- Some LLM use cases
- How to use LLMs effectively
- The risks of using LLMs
1.1 What are Large Language Models?
1.2 What can LLMs do?
LLMs can be created for any kind of language!
(programming languages, protein sequences etc.)
In this course we discuss only applications for human language
1.3 Types of LLMs
Monomodal: processes a single kind of data (e.g. text only)
Multimodal: processes several kinds of data (e.g. text and images)
1.4 Some LLM Examples
Name | Created by | Type | License |
GPT-3 | OpenAI | Monomodal | Commercial |
GPT-NeoX | EleutherAI | Monomodal | Open Source |
LLaMA | Meta | Monomodal | Research |
GPT-4 | OpenAI / Microsoft | Multimodal | Commercial |
1.5 Access Modes
Direct: you interact with the LLM itself (e.g. via a chat interface or API)
Indirect (via Application): the LLM is embedded in another software product
LLMs are "hidden" in more and more software products!
2.1 How LLMs work
NOT a database with stored question / answer pairs!
Neural Network made from artificial neurons
2.1 How LLMs work 2
Artificial neuron = basic computation unit executing very simple calculations
Parameters = numbers stored in the neuron that control how it computes
(3, 2, 4 and 10 in the example above)
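A single artificial neuron can be sketched in a few lines of Python. The numbers 3, 2, 4 and 10 below mirror the example values above; treating the first three as weights and the last as a bias is an assumption, since the original figure is not reproduced here:

```python
def neuron(inputs, weights, bias):
    """A basic artificial neuron: weighted sum of the inputs plus a bias,
    passed through a simple activation function (here: ReLU)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return max(0.0, total)  # ReLU: negative sums become 0

# The parameters (here the weights 3, 2, 4 and the bias 10) control
# how the neuron computes.
output = neuron([1.0, 2.0, 0.5], weights=[3.0, 2.0, 4.0], bias=10.0)
```

An LLM chains billions of such units together; changing the parameters changes the computation.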
2.2 LLM Training
- Mostly trained on very large amounts of data downloaded from the internet (the "corpus")
- Training task: e.g. find a missing word in a sentence
- Example: "Meanwhile the ______ ran straight to the grandmother's house and knocked at the door." ➜ "wolf"
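The training objective can be illustrated with a deliberately tiny sketch. This is purely illustrative: a real LLM learns the prediction statistically by adjusting billions of parameters, not by counting words in a lookup table:

```python
from collections import Counter

# Toy "corpus": fillers observed for a given context in the training data.
observed = {
    "the ___ ran to the grandmother's house": ["wolf", "wolf", "girl", "wolf"],
}

def predict_missing_word(context):
    """Return the most frequently observed filler for this context."""
    return Counter(observed[context]).most_common(1)[0][0]

prediction = predict_missing_word("the ___ ran to the grandmother's house")
```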
2.2 LLM Training 2
- Size of the parameter data ≪ size of the training corpus!
- ➜ LLMs almost never store training data "as is"
- ➜ The output of LLMs is almost always novel!
- ➜ But in rare cases they do output parts of their training data
2.3 LLM complexity (examples)
Name | Created by | #Param. | Training corpus |
GPT-3 | OpenAI | 175B | 499B tokens |
GPT-NeoX | EleutherAI | 20B | 825 GiB |
LLaMA | Meta | 65B | 1400B tokens |
GPT-4 | OpenAI / Microsoft | ? | ? |
3.1 Some use cases of LLMs: capabilities
- Summarization of texts
- Explain texts in simple terms
- Rewrite or shorten sentences
- Spell check / grammar check
- Translate into another language
- Text comparison and search
3.2 Some use cases of LLMs: tasks
- Write concepts
- Write cover letters, emails
- Write marketing texts (e.g. blog post)
- Write program code
- Create configuration files for software
- ...
4.1 How to use chatbots
- "Conversation" = series of questions and answers
- Chatbots remember what you have said before
- ➜ You can provide detailed specifications of the problem before asking your question
- ➜ You can ask for corrections, changes and improvements
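The "conversation with memory" idea can be sketched as a growing list of messages that is resent in full with every new question. The role/content format below mirrors a common chat-API pattern and is an illustrative assumption, not any specific product's API:

```python
def build_request(history, new_question):
    """The chatbot 'remembers' because the whole history is resent each turn."""
    return history + [{"role": "user", "content": new_question}]

history = [
    {"role": "user", "content": "I need a formal, 100-word summary."},
    {"role": "assistant", "content": "Understood. Please paste the text."},
]
request = build_request(history, "Here is the text: ...")
```

Because earlier specifications travel along with every request, corrections like "make it shorter" work without restating the task.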
4.2 Chatbots: Limitations
- Trained to decline inappropriate requests
- No knowledge of events after training date
- Currently limited math capabilities
- Usually can't tell where their knowledge comes from
- Sometimes produce boring output
- Sometimes struggle with complex tasks / problems
- Forget what was said after a certain number of words
4.3 Chatbots: Boring output
- LLMs are trained on huge amounts of data
- ➜ Answers to simple questions correspond to an average of opinions / possible outputs.
- ➜ Often boring
Solution: ask a detailed, specific question!
4.4 TIP: how to solve complex problems
Let the bot solve the problem in multiple steps!
- Ask the bot to structure the problem into subproblems first
- Then ask the bot to solve the problem following the list it has created
➜ The chatbot is often able to solve the problem!
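The two-step tip above can be sketched as two prompt templates; the exact wording is illustrative, not a prescribed formula:

```python
def decomposition_prompt(problem):
    """Step 1: ask only for a structured plan, not the solution."""
    return ("Break the following problem into numbered subproblems. "
            "Do not solve it yet.\n\nProblem: " + problem)

def solving_prompt(problem, plan):
    """Step 2: ask for a solution that follows the bot's own plan."""
    return ("Now solve the problem step by step, following your plan.\n\n"
            "Problem: " + problem + "\n\nPlan:\n" + plan)

problem = "Plan a 10-day trip to 3 cities on a budget of CHF 2000."
step1 = decomposition_prompt(problem)
# `plan` would be the chatbot's answer to step1, pasted back in:
step2 = solving_prompt(problem, "1. Choose cities\n2. Split the budget\n3. ...")
```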
4.5 Chatbots: Very long conversations
Examples of the maximum token / word counts of LLMs:
Model | Tokens | Words |
ChatGPT | 4'096 | ∼3'000 |
GPT-4 | 32'768 | ∼25'000 |
➜ Older parts of the conversation are not considered for the answer!
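Why the oldest turns drop out can be sketched as a sliding window: messages are kept from newest to oldest until the token budget is spent. The ~0.75 words-per-token ratio below is the rough rule of thumb implied by the table, not an exact value:

```python
def estimate_tokens(text, words_per_token=0.75):
    """Rough token estimate from the word count (~0.75 words per token)."""
    return int(len(text.split()) / words_per_token)

def fit_to_window(messages, max_tokens):
    """Keep only the most recent messages that still fit into the window."""
    kept, used = [], 0
    for msg in reversed(messages):       # walk from newest to oldest
        cost = estimate_tokens(msg)
        if used + cost > max_tokens:
            break                        # everything older is dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = ["old question " * 1000, "recent question", "latest question"]
window = fit_to_window(history, max_tokens=100)  # the old part no longer fits
```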
4.6 TIP: how to have very long conversations
Let the bot summarize old parts!
- Copy the old parts of the conversation
- Let the chatbot summarize them. Copy the output
- Include the summary before your latest input
➜ Answer considers old parts of the conversation!
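The summarize-and-reinsert workflow above can be sketched as follows; `fake_summarize` is a placeholder for the step where you paste the old turns into the chatbot and copy its summary back:

```python
def compress_history(old_turns, summarize):
    """Replace the old conversation turns with one summary line.
    `summarize` stands in for a call to the chatbot itself."""
    return "Summary of our conversation so far: " + summarize("\n".join(old_turns))

def next_input(old_turns, latest_question, summarize):
    """Prepend the summary so the answer can consider the old parts."""
    return compress_history(old_turns, summarize) + "\n\n" + latest_question

# Placeholder summarizer (illustrative only).
fake_summarize = lambda text: "the user wants a travel plan under CHF 2000"
prompt = next_input(["turn 1 ...", "turn 2 ..."], "Which city first?", fake_summarize)
```

The summary costs far fewer tokens than the full old conversation, so it fits inside the context window alongside the new question.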
5. LLMs: What are the risks?
- Inaccurate or wrong information
- Generic information given to many people
- Biased content generated
- Harmful content generated
- Breach of confidential information
- Loss of copyright protection of your work
- Output might be protected by copyright
- Patent becoming inadmissible
5.1 Inaccurate or wrong information
- NOT a database with verified content
- A neural network trained from internet content
- ➜ Mistakes
- ➜ Hallucinations
5.1 Inaccurate or wrong information
Mistakes can be subtle!
Hallucinations might be convincing!
5.2 Generic information given to many people
Same question ➜ same (or similar) answer!
➜ You end up using the same strategies as your competitors!
5.2 TIP: how to avoid generating generic output
Make your question unique!
- Add boundary conditions, special requirements...
- Describe the desired style of the output
➜ Interesting and unique outputs!
5.3 Biased content generated
- Age
- Appearance
- Gender
- Race
- Sexual orientation
5.3 TIP: how to correct biases in outputs
- Just ask the chatbot to fix it for you!
➜ In many cases the chatbot will improve the answer!
5.4 Harmful content generated
- Advice or encouragement for self-harm behaviors
- Graphic material (sexual, violent...)
- Harassing, demeaning, and hateful content
5.5 Breach of confidential information
- Your question is transmitted to the server hosting the LLM, where it can be stored in a database
- Your inputs might be used as training data for future versions of the LLM (and therefore later revealed to other users)
➜ Never enter confidential information!
5.5 TIP: how to avoid that your input is used as training data
Avoid giving training signals!
- Do not click on links in the output (a click signals "useful!")
- Do not rate (thumbs up / thumbs down) the output quality of the LLM
➜ Your data is less useful for training! (not 100% safe)
5.6 Loss of copyright protection of your work
- U.S. Copyright Office decision: works generated by AI cannot be protected by copyright
- ➜ Your work, if mixed with such content, could also lose copyright protection!
➜ AI generated content must be separated from other content and declared as such!
5.7 Output might be protected by copyright
- LLMs sometimes create output which resembles IP-protected (copyright, trademark, ...) training data
- ➜ Your work, if mixed with such content, could violate other people's intellectual property!
➜ You have to check the generated output for IP violations (difficult)!
5.8 Patent becoming inadmissible
- Inputs to an LLM can be used as training data and might be output to other users
- ➜ The novelty criterion for a patent application is no longer satisfied!
- ➜ The patent might become inadmissible!
➜ Again: never enter confidential information!
THE END
Thank you for your attention