Summarization GRPO idea

21 May, 2025

Ight, so Gosling wants to train a whole new model for summarization? I say nah.

Idea: take long-ctx RP logs (~16K ctx) from sources like https://huggingface.co/datasets/PocketDoc/Dans-Personamaxx-Logs or smth

and then use sentence transformers to compare a model's output (its summary of that log) against the OG text

Apparently there's a way to do that for extractive vs abstractive summarization models, but my friend hasn't found the link, so smth like embeddings to compare the 2 texts for now...
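
Rough sketch of what "embeddings to compare the 2 texts" could look like, in Python. Everything named here (`toy_embed`, `cosine`, `summary_similarity`, the example strings) is made up for illustration. The embedder is a hashed bag-of-words stand-in so the snippet runs with just the standard library; it is not a real sentence embedding and would need to be swapped for an actual model.

```python
import math
import re
import zlib
from typing import Callable, List, Sequence

# Any function that maps text to a vector of floats (assumed interface).
Embedder = Callable[[str], Sequence[float]]

def toy_embed(text: str, dim: int = 512) -> List[float]:
    """Hypothetical stand-in embedder: hashed bag-of-words counts.

    Only here so the sketch runs without a model download.
    """
    vec = [0.0] * dim
    for word in re.findall(r"[a-z0-9']+", text.lower()):
        # crc32 is stable across runs, unlike Python's built-in hash() for str
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    return vec

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def summary_similarity(log_text: str, summary: str,
                       embed: Embedder = toy_embed) -> float:
    """Score a summary by how close its embedding is to the log's embedding."""
    return cosine(embed(log_text), embed(summary))

# Made-up example texts, not taken from any dataset.
log = ("Mira and Tobias argue about the stolen map, then sneak into the "
       "harbor at night to find the smuggler's ship.")
good = "Mira and Tobias argue over the stolen map and sneak into the harbor."
bad = "A recipe for baking sourdough bread with rye flour."

print("on-topic :", summary_similarity(log, good))
print("off-topic:", summary_similarity(log, bad))
```

With the sentence-transformers library, passing something like `embed=lambda t: model.encode(t)` for a loaded `SentenceTransformer` model should slot in here, though that part is untested in this sketch. Two things to watch: most of those models truncate long inputs, so a ~16K log would probably need chunking, and a "summary" that just copies the log scores a perfect 1.0, so some length cap is likely needed.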

anyway, the verifier in this case is the embedding similarity: if the summary is good and the embedding says so, reward it, if not, penalize it.
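
A minimal sketch of how those similarity scores could feed the GRPO side, assuming the similarity is used directly as the reward. GRPO scores several sampled summaries of the same log as a group and normalizes each reward against the group's mean and standard deviation. The function name, the `eps` value, the choice of population standard deviation, and the example numbers are all assumptions for illustration, not anything from a specific trainer.

```python
import statistics
from typing import List

def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of samples for the same prompt.

    advantage_i = (reward_i - mean(group)) / (std(group) + eps)
    Population std is an assumption here; eps avoids division by zero
    when every sample in the group gets the same reward.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Made-up similarity scores for 4 sampled summaries of one log.
similarities = [0.82, 0.41, 0.67, 0.30]
advantages = group_relative_advantages(similarities)

for sim, adv in zip(similarities, advantages):
    verdict = "reward" if adv > 0 else "penalize"
    print(f"similarity={sim:.2f} advantage={adv:+.2f} -> {verdict}")
```

So "good" here is relative: a summary only gets pushed up if it beats the other samples for the same log, and a group where every sample scores the same gives no learning signal at all.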

I'll probably do this later... but I really wanna do it...

For now I'm working on Sol-Revear, my new line of models... and it's going... shittily