Fix speaker dialogue lines that were incorrectly merged onto the same line
Fixes collapsed speaker lines by inserting newlines before mid-line speaker labels. This corrects a common LLM error where dialogue turns are merged onto one line instead of being separated.
speakerLabels: Speaker labels to recognize when fixing collapsed lines. If not provided, the function infers labels from the text by finding capitalized words followed by a colon that appear two or more times.
Auto-inference pattern: matches 1-3 capitalized words ending with a colon (e.g., "Questioner:", "The Shaykh:", "Shaykh Ibn Bāz:").
Example: ["Questioner", "The Shaykh", "Mu'adhdhin"]
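The auto-inference rule described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the library's actual implementation: `inferSpeakerLabels` and `LABEL_PATTERN` are hypothetical names, and the definition of a "capitalized word" used here (an uppercase letter followed by letters, combining marks, apostrophes, or hyphens) is a guess at what the documented pattern means.

```typescript
// Hypothetical sketch of label inference; NOT wobble-bibble's real code.
// Assumption: a capitalized word is an uppercase letter followed by
// letters, combining marks, apostrophes, or hyphens.
// The lookbehind stops a match from starting in the middle of a word.
const LABEL_PATTERN =
  /(?<![\p{L}\p{M}'’-])(\p{Lu}[\p{L}\p{M}'’-]*(?: \p{Lu}[\p{L}\p{M}'’-]*){0,2}):/gu;

function inferSpeakerLabels(text: string, minOccurrences = 2): string[] {
  // Count how often each candidate label (1-3 capitalized words + colon) appears.
  const counts = new Map<string, number>();
  for (const match of text.matchAll(LABEL_PATTERN)) {
    const label = match[1];
    counts.set(label, (counts.get(label) ?? 0) + 1);
  }
  // Keep only labels seen at least `minOccurrences` times, in first-seen order.
  return [...counts.entries()]
    .filter(([, count]) => count >= minOccurrences)
    .map(([label]) => label);
}

const sample = `P1 - Questioner: First question?
P2 - The Shaykh: First answer. Questioner: Follow-up question?
P3 - The Shaykh: Final answer.`;

console.log(inferSpeakerLabels(sample));
// [ 'Questioner', 'The Shaykh' ]
```

Requiring at least two occurrences keeps one-off phrases such as "Note:" from being treated as speakers, which matches the "2+ times" condition stated above.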
leadingPunctuation: Punctuation tokens that may appear before a collapsed speaker label. When inserting a newline before a speaker label, the function preserves trailing punctuation from the previous sentence.
Default: ['.', '?', '!', '…', '،', '؛', ':', ':', '-', '–', '—']
Example: "The ruling is permissible. The Shaykh: Another point..." becomes:
The ruling is permissible.
The Shaykh: Another point...
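The splitting step can be pictured with a small regular-expression sketch. It is an illustration only, not the library's implementation: `splitCollapsedLabels` and `escapeRegExp` are hypothetical helpers, and the sketch defaults to a narrower punctuation list (`.`, `?`, `!`) than the documented default because it does not model how a dash inside a prefix such as `P1 -` should be treated.

```typescript
// Hypothetical helpers for illustration; NOT wobble-bibble's real code.

// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(value: string): string {
  return value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function splitCollapsedLabels(
  text: string,
  labels: string[],
  punctuation: string[] = ['.', '?', '!'], // assumption: narrower than the documented default
): { text: string; fixes: number } {
  if (labels.length === 0 || punctuation.length === 0) {
    return { text, fixes: 0 };
  }
  const labelPart = labels.map(escapeRegExp).join('|');
  const punctPart = punctuation.map(escapeRegExp).join('|');
  // Group 1: the punctuation to keep. Group 2: the label plus its colon.
  // [^\S\n]+ matches spaces or tabs but never an existing newline,
  // so labels that already start a line are left alone.
  const pattern = new RegExp(`(${punctPart})[^\\S\\n]+((?:${labelPart}):)`, 'g');
  let fixes = 0;
  const fixed = text.replace(pattern, (_match, punct: string, label: string) => {
    fixes += 1;
    return `${punct}\n${label}`;
  });
  return { text: fixed, fixes };
}

const collapsed = 'The ruling is permissible. The Shaykh: Another point...';
console.log(splitCollapsedLabels(collapsed, ['The Shaykh']).text);
// The ruling is permissible.
// The Shaykh: Another point...
```

Because the punctuation is captured and written back before the newline, the previous sentence keeps its closing mark, which is the behavior described above.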
import { fixCollapsedSpeakerLines } from 'wobble-bibble';

const malformed = `P1 - Questioner: What is the ruling? The Shaykh: The ruling is permissible.`;
const result = fixCollapsedSpeakerLines(malformed, {
  speakerLabels: ['Questioner', 'The Shaykh']
});

console.log(result.text);
// P1 - Questioner: What is the ruling?
// The Shaykh: The ruling is permissible.

console.log(result.counts);
// { fixCollapsedSpeakerLines: 1 }

const text = `P1 - Questioner: First question?
P2 - The Shaykh: First answer. Questioner: Follow-up question?
P3 - The Shaykh: Final answer.`;

// No config provided - labels will be inferred
const result = fixCollapsedSpeakerLines(text);

console.log(result.text);
// P1 - Questioner: First question?
// P2 - The Shaykh: First answer.
// Questioner: Follow-up question?
// P3 - The Shaykh: Final answer.

const text = `P1 - The answer is yes. The Shaykh: However, there are conditions.`;
const result = fixCollapsedSpeakerLines(text, {
  speakerLabels: ['The Shaykh'],
  leadingPunctuation: ['.', '?', '!']
});

console.log(result.text);
// P1 - The answer is yes.
// The Shaykh: However, there are conditions.
// Note: The period is preserved before the newline

const wellFormatted = `P1 - Questioner: What is the ruling?
P2 - The Shaykh: The ruling is permissible.`;
const result = fixCollapsedSpeakerLines(wellFormatted);

console.log(result.applied);
// []

console.log(result.counts);
// { fixCollapsedSpeakerLines: 0 }

console.log(result.text === wellFormatted);
// true (text unchanged)
If no speaker labels are found (either provided or inferred), the function returns the original text unchanged:
const text = `P1 - Some text without any speaker labels.`;
const result = fixCollapsedSpeakerLines(text);
// No labels found, nothing to fix

console.log(result.text === text);
// true

console.log(result.counts);
// { fixCollapsedSpeakerLines: 0 }
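Since the examples read `text`, `applied`, and `counts` from the result, a caller may want a quick check for whether anything was changed. The sketch below is one possible way to do that. `FixResultLike` and `totalFixes` are hypothetical names, and the interface is only a guess at the result shape based on the fields the examples use (in particular, the element type of `applied` is not stated in this page, so it is left as `unknown`).

```typescript
// Assumed result shape, inferred only from the fields used in the examples;
// the library's real type may differ.
interface FixResultLike {
  text: string;
  applied: readonly unknown[];
  counts: Record<string, number>;
}

// Hypothetical helper: sum every counter so callers can detect a no-op.
function totalFixes(result: FixResultLike): number {
  return Object.values(result.counts).reduce((sum, n) => sum + n, 0);
}

// Hand-written object mirroring the "no labels found" case above.
const unchanged: FixResultLike = {
  text: 'P1 - Some text without any speaker labels.',
  applied: [],
  counts: { fixCollapsedSpeakerLines: 0 },
};

console.log(totalFixes(unchanged) === 0);
// true
```

Checking the counters avoids comparing the input and output strings, which can matter for long transcripts.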