Skip to content

dwulff/LLM4BeSci_2025MetaRep

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

60 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LLM4BeSci at MetaRep, MPIB Berlin, Dec 2025

cover image

The course introduces the use of open-source large language models (LLMs) from the Hugging Face ecosystem for research in the behavioral and social sciences.

Help by Dirk Wulff
Materials by Zak Hussain, Taisiia Tikhomirova, Valentin Kriegmair, and Dirk Wulff

Schedule

Day 1

02:00 PM - 02:45 AM: Welcome & Intro
02:30 PM - 03:45 PM: Talk: Intro to LLMs
03:45 PM - 04:00 PM: Break
04:00 PM - 04:30 PM: Talk: A gentle intro to Hugging Face and Python
04:30 PM - 04:45 PM: Setup Colab
04:45 PM - 05:30 PM: Exercise: Running pipelines
05:30 PM - 06:00 PM: Walkthrough
07:00 PM: Dinner (self-paid)

Day 2

09:30 AM - 10:00 AM: Recap quiz
10:00 AM - 11:00 AM: Talk: Intro to transformers & embeddings
10:45 AM - 11:00 AM: Break
11:00 AM - 12:00 PM: Talk: Intro to transformers & embeddings (continued)
12:00 PM - 01:00 PM: Discussion: Find applications in small groups
01:00 PM - 02:00 PM: Lunch
02:00 PM - 03:00 PM: Exercise: Clarifying personality psychology
03:00 PM - 03:30 PM: Walkthrough
03:30 PM - 03:45 PM: Break
03:45 PM - 04:45 PM: Intro to classification and regression
04:15 PM - 05:15 PM: Exercise: Classifying media bias (combination of 2 classification a and b)
05:15 PM - 06:00 PM: Walkthrough
07:00 PM: Dinner (self-paid)

Day 3

09:30 AM - 10:30 AM: Recap quiz
10:30 AM - 11:00 AM: Talk: Intro to info extraction
11:00 AM - 11:15 AM: Break
11:10 AM - 12:00 PM: Exercise: Info extraction from articles (see also Outlines example)
12:00 PM - 01:00 PM: Discussion: Find applications in small groups
01:00 PM - 02:00 PM: Lunch
02:00 PM - 03:00 PM: Talk: Research applications
03:00 PM - 04:00 PM: Open questions

Resources

Hussain, Binz, Mata, & Wulff (2024). A tutorial on open-source large language models for behavioral science. Behavior Research Methods, 1-24.

@article{hussain2024tutorial,
  title={A tutorial on open-source large language models for behavioral science},
  author={Hussain, Zak and Binz, Marcel and Mata, Rui and Wulff, Dirk U},
  journal={Behavior Research Methods},
  pages={1--24},
  year={2024},
  publisher={Springer}
}

Hugging face documentation
Hugging face book
But what is a GPT (3Blue1Brown)

Before the course

Hugging Face and Meta Llama License

  1. Make sure you have a Hugging Face account (https://huggingface.co/join).
  2. Go to the meta-llama/Llama-3.2-3B-Instruct model page and fill in the 'COMMUNITY LICENSE AGREEMENT' form at the top of the page to get access to the model (this may take a day or so).

Google Colab and GitHub Repository

  1. If you do not have a Google account, you will need to create one (this can be deleted after the workshop).

During the course

  1. Navigate to Google Drive (https://drive.google.com/).
  2. In the top-left, click New > More > Colaboratory. If you do not see Colaboratory, you may need to click "Connect more apps", search for 'Colaboratory,' and install it. Then click New > More > Colaboratory.
  3. Copy the following code snippet into the first cell of the notebook. Run it (shift + enter or click ► button) to mount your Google Drive to the Colab environment. A pop-up will ask you to connect; click through the steps to connect your Google Drive to Colab (unfortunately, this only works if you give Google full permissions).
from google.colab import drive
drive.mount("/content/drive")
  1. Create a second cell in your notebook using the "+ Code" button that appears when you hover your cursor right under the first cell. Copy and run the following code snippet in the second cell of your notebook to clone the GitHub repository to your Google Drive:
%cd /content/drive/MyDrive
!git clone https://github.com/dwulff/LLM4BeSci_2025MetaRep
  1. Go back to your Google Drive and navigate to the folder "dwulff/LLM4BeSci_2025MetaRep" (this is your own copy, so you can edit it how you like, and the changes won't affect anyone else's copies). You should see the directories day_1, day_2, day_3 containing the relevant notebooks (.ipynb files) and data (it may take a couple of minutes for the files to appear) for each day's exercises.

You have now successfully set up your Google Colab environment and cloned the GitHub repository!

You are now ready to work through all the exercises in the course!

Packages

No packages published

Contributors 2

  •  
  •