Reinforcement Learning from Human Feedback

Course Overview

What You'll Learn

  • Reinforcement Learning from Human Feedback (RLHF) is currently the main method for aligning LLMs with human values and preferences.
  • In this course, you will gain a conceptual understanding of the RLHF training process, and then practice applying RLHF to tune an LLM.

Large language models (LLMs) are trained on human-generated text, but additional methods are needed to align an LLM with human values and preferences. Reinforcement Learning from Human Feedback (RLHF) is currently the main method for this alignment. RLHF is also used to further tune a base LLM toward values and preferences that are specific to your use case.

In this course, you will gain a conceptual understanding of the RLHF training process, and then practice applying RLHF to tune an LLM. You will:

  1. Explore the two datasets that are used in RLHF training: the “preference” and “prompt” datasets.
  2. Use the open-source Google Cloud Pipeline Components Library to fine-tune the Llama 2 model with RLHF.
  3. Assess the tuned LLM against the original base model by comparing loss curves and using the “Side-by-Side (SxS)” method.

Course FAQs

Is this an accredited online course?

Accreditation for 'Reinforcement Learning from Human Feedback' is determined by the provider, DeepLearning.AI. For online college courses or degree programs, we strongly recommend you verify the accreditation status directly on the provider's website to ensure it meets your requirements.

Can this course be used for continuing education credits?

Many of the courses listed on our platform are suitable for professional continuing education. However, acceptance for credit varies by state and licensing board. Please confirm with your board and DeepLearning.AI that this specific course qualifies.

How do I enroll in this online school program?

To enroll, click the 'ENROLL NOW' button on this page. You will be taken to the official page for 'Reinforcement Learning from Human Feedback' on the DeepLearning.AI online class platform, where you can complete your registration.