Extracting Individual Comments from Yahoo News
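Individual comments on a Yahoo News comments page can be extracted in the browser with the CSS selectors below. The class names (sc-169yn8p-...) look auto-generated, so treat them as likely to change and verify them against the live page.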
Comment Item Selector
const commentItems = document.querySelectorAll('li.sc-169yn8p-0');
Each li.sc-169yn8p-0 element contains one comment.
Comment Structure
Each comment contains:
- Username (optional)
- Comment text
- Reactions (empathy 共感した, insightful なるほど, disagree うーん)
- Reply count (optional)
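The structure above could be captured as a plain record, sketched here under assumptions: the names CommentRecord, makeCommentRecord, reactions, and replyCount are hypothetical, and reaction values are assumed to be numeric counts. The extraction code in this section collects only the text and username.

```javascript
/**
 * Hypothetical record shape for one comment.
 * @typedef {Object} CommentRecord
 * @property {string} text
 * @property {string} [username]
 * @property {{empathy: number, insightful: number, disagree: number}} [reactions]
 * @property {number} [replyCount]
 */

/**
 * Builds a CommentRecord, leaving out optional fields that are absent.
 * @returns {CommentRecord}
 */
function makeCommentRecord({ text, username, reactions, replyCount }) {
  const record = { text };
  if (username) record.username = username;
  if (reactions) record.reactions = { ...reactions };
  if (typeof replyCount === 'number') record.replyCount = replyCount;
  return record;
}

// An empty username is treated as "no username" and dropped.
const example = makeCommentRecord({ text: 'Example comment body.', username: '' });
console.log(example); // { text: 'Example comment body.' }
```

Omitting absent fields, rather than storing empty strings or zeros, keeps "not shown on the page" distinguishable from "shown as zero".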
Extraction Code
const extractedComments = [];
// Only username and text are collected; reactions and reply counts are not extracted here.
commentItems.forEach((item) => {
  const commentArticle = item.querySelector('article.sc-169yn8p-3');
  if (!commentArticle) return;

  // Extract username from the alt text of the image inside the user link
  let username = '';
  const usernameLink = commentArticle.querySelector('a[href*="/users/"]');
  if (usernameLink) {
    const img = usernameLink.querySelector('img');
    username = img?.getAttribute('alt') || '';
  }

  // Extract comment text: take the first paragraph that passes the filter
  let commentText = '';
  const paragraphs = commentArticle.querySelectorAll('p');
  for (const p of Array.from(paragraphs)) {
    const text = p.textContent?.trim() || '';
    // Filter out UI elements and very short paragraphs
    if (text.length > 20 &&
        !text.includes('このコメントを削除') && // "Delete this comment"
        !text.includes('非表示・報告') && // "Hide / Report"
        !text.includes('共感した') && // "Empathized" reaction
        !text.includes('なるほど') && // "I see" (insightful) reaction
        !text.includes('うーん')) { // "Hmm" (disagree) reaction
      commentText = text;
      break;
    }
  }

  // Redundant given the length > 20 filter above; kept as a safeguard
  if (!commentText || commentText.length < 10) return;

  extractedComments.push({
    text: commentText,
    username: username || undefined,
  });
});
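One consequence of storing a missing username as undefined rather than an empty string shows up when the results are serialized. The sketch below uses made-up sample data (sampleComments is a hypothetical stand-in for extractedComments) to show that JSON.stringify leaves the username key out entirely in that case.

```javascript
// Hypothetical sample data in the same shape as extractedComments.
const sampleComments = [
  { text: 'A comment long enough to pass the filter.', username: 'example_user' },
  { text: 'Another comment, posted without a visible name.', username: undefined },
];

// JSON.stringify skips object properties whose value is undefined,
// so the second record is serialized without a "username" key.
const json = JSON.stringify(sampleComments, null, 2);
const parsed = JSON.parse(json);

console.log(Object.keys(parsed[0])); // [ 'text', 'username' ]
console.log(Object.keys(parsed[1])); // [ 'text' ]
```

Consumers of the JSON should therefore test for the presence of username rather than compare it to an empty string.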
Filtering Rules
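- Ignore paragraphs of 20 characters or fewer, so the effective minimum comment length is 21 characters; the final check for fewer than 10 characters is only a safeguard
- Exclude any paragraph containing UI button text (reactions, delete, report); the match is by substring, so a genuine comment that contains one of these phrases is excluded as well
- Use the first qualifying paragraph as the comment text
- Skip the comment entirely if no paragraph qualifies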
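The rules above can be sketched as a pure function over paragraph strings, which makes them checkable without a DOM. The names pickCommentText, UI_LABELS, and MIN_PARAGRAPH_LENGTH are hypothetical; the threshold of 20 and the label strings mirror the extraction code.

```javascript
// UI labels to exclude: delete, hide/report, and the three reaction buttons.
const UI_LABELS = ['このコメントを削除', '非表示・報告', '共感した', 'なるほど', 'うーん'];

// A paragraph must be strictly longer than this to qualify.
const MIN_PARAGRAPH_LENGTH = 20;

/**
 * Returns the first paragraph that is long enough and contains no UI label,
 * or an empty string if none qualifies.
 * @param {Array<string|null|undefined>} paragraphTexts
 * @returns {string}
 */
function pickCommentText(paragraphTexts) {
  for (const raw of paragraphTexts) {
    const text = (raw ?? '').trim();
    const hasUiLabel = UI_LABELS.some((label) => text.includes(label));
    if (text.length > MIN_PARAGRAPH_LENGTH && !hasUiLabel) {
      return text;
    }
  }
  return '';
}

console.log(pickCommentText([
  '共感した',
  'Too short',
  'This paragraph is long enough to count as a comment.',
]));
// → This paragraph is long enough to count as a comment.
```

In the browser, the input could be built with Array.from(commentArticle.querySelectorAll('p'), (p) => p.textContent).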