Clean & Extract Product Descriptions from HTML (~5,000 Rows)

Client: AI | Published: 11.04.2026
Budget: $250

I have a spreadsheet (~5,000 product rows) where each row contains a full HTML eBay listing template.

Each row includes:
- ID
- SKU
- Description
- Short description

The Description field contains a large block of HTML (decorative listing template), but the actual product description is embedded inside it. Your job is to extract the correct text.

1. Extract the Correct Description

In every row, the real product description is located inside this HTML block:

<div class="desc-rd desc-text">

Requirements:
- Extract only the content inside this div
- Ignore all other HTML content in the row (menus, images, headers, shipping info, footer, etc.)
- Do not use the rest of the HTML outside this block

2. Clean the Extracted Text

The content inside the div typically contains HTML such as:
- <p> tags
- <span> tags

Requirements:
- Remove all HTML tags
- Preserve paragraph structure:
  - Each <p> should become a new line
  - <br> should become a new line
- Output clean, readable plain text

Example result format:

Paragraph 1
(blank line)
Paragraph 2

3. Remove Trailing Inventory Tags

Each description ends with an internal tag such as:
- BTG-6772
- BTG-10284

Requirements:
- Remove this tag from the final text
- Clean up any leftover spacing

4. Final Output

- Write the cleaned text into the Description column
- Completely clear the Short description column for all rows
- Do not modify the SKU column

Deliverable

One cleaned Excel file with:
- Cleaned Description column
- Empty Short description column

Requirements

- Must be completed programmatically (Python preferred)
- Experience parsing HTML (e.g., BeautifulSoup or similar)
- Strong attention to detail
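
For illustration, steps 1 to 3 could be sketched roughly as follows. This is only one possible approach, not a required implementation: it uses Python's standard-library `html.parser` in place of BeautifulSoup so that it runs without extra installs. The names `DescriptionExtractor` and `clean_description` are hypothetical, and the pattern assumes the inventory tag always has the form `BTG-` followed by digits at the very end of the text.

```python
import re
from html.parser import HTMLParser

# Both classes must be present on the target div.
TARGET_CLASSES = {"desc-rd", "desc-text"}
# Assumption: the trailing inventory tag always looks like "BTG-<digits>".
TRAILING_TAG = re.compile(r"\s*\bBTG-\d+\s*$")


class DescriptionExtractor(HTMLParser):
    """Collects text inside the first <div class="desc-rd desc-text">."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0     # <div> nesting depth inside the target; 0 = outside
        self.done = False  # True once the target div has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0:
            classes = set((dict(attrs).get("class") or "").split())
            if tag == "div" and TARGET_CLASSES <= classes:
                self.depth = 1
            return
        if tag == "div":
            self.depth += 1            # track nested divs so we close correctly
        elif tag == "br":
            self.parts.append("\n")    # <br> becomes a line break
        elif tag == "p":
            self.parts.append("\n\n")  # <p> starts a new paragraph

    def handle_endtag(self, tag):
        if self.done or self.depth == 0:
            return
        if tag == "div":
            self.depth -= 1
            if self.depth == 0:
                self.done = True
        elif tag == "p":
            self.parts.append("\n\n")

    def handle_data(self, data):
        if self.depth > 0 and not self.done:
            # Source-code whitespace (including newlines and non-breaking
            # spaces) is collapsed; real line breaks come only from <p>/<br>.
            self.parts.append(re.sub(r"\s+", " ", data))


def clean_description(html):
    """Return plain text from the target div, minus the trailing BTG tag."""
    parser = DescriptionExtractor()
    parser.feed(html or "")
    parser.close()
    lines = [re.sub(r" {2,}", " ", line).strip()
             for line in "".join(parser.parts).split("\n")]
    text = re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()
    return TRAILING_TAG.sub("", text).strip()
```

A row with no matching div yields an empty string, which makes such rows easy to flag for manual review. For step 4, the function could be applied to the Description column row by row (for example with pandas `read_excel`, `Series.map`, and `to_excel`), with the Short description column set to empty and SKU left untouched.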