This blog post covers my project for the AI Safety Fundamentals Alignment Course, showing how steering vectors created using Sparse Autoencoders (SAEs) can affect a model's generated output.
Using an SAE as a Steering Vector