Multi-Platform Crawl & LLM Insights

I need a compact Python crawler that pulls public content from Twitter, Instagram and LinkedIn, covering text, image and video posts for any handle I feed it.

Here’s the flow I have in mind. The script collects the raw post data (caption, hashtags, basic engagement numbers and, where accessible, image/video URLs) through whichever mix of libraries makes sense: Tweepy or Twitter API v2 for Twitter, Instaloader or Selenium for Instagram, and the official or unofficial LinkedIn API for LinkedIn.

After normalising everything into a common JSON schema, the crawler should pass that dataset to an LLM endpoint (OpenAI or similar) and receive back a concise, structured report that includes:

• Brand sentiment (positive / neutral / negative trends)
• Key thematic buckets the brand talks about
• Audience-engagement highlights such as most-reacted posts, average comment tone and any spikes

The end product I’m expecting is:

1. Well-commented Python code with a requirements.txt.
2. A .env-based config for keys, rate-limits and the LLM endpoint.
3. A sample run (readme + Jupyter notebook or plain script) that outputs the JSON dump and the LLM-generated insight report.

If the APIs hit a wall, graceful fall-back through headless browsing is fine so long as it stays within the target platforms’ terms of service.

Accuracy of the scraped metrics and clarity of the LLM output will be my primary acceptance criteria.
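
To make the "common JSON schema" and LLM hand-off steps in this brief more concrete, here is a minimal sketch of what I have in mind. It is only an illustration, not a required design: the `Post` fields and the helper names (`normalise_tweet`, `top_posts`, `posts_to_json`, `build_prompt`) are hypothetical, and the tweet input is assumed to be a Twitter API v2 tweet object with `public_metrics` requested. Instagram and LinkedIn would need their own normalisers that map into the same `Post` shape. The sketch uses only the standard library and does not call any platform or LLM API.

```python
import json
import re
from dataclasses import asdict, dataclass, field

HASHTAG_RE = re.compile(r"#(\w+)")


@dataclass
class Post:
    """Hypothetical common schema shared by all three platforms."""
    platform: str
    handle: str
    post_id: str
    text: str
    hashtags: list = field(default_factory=list)
    likes: int = 0
    comments: int = 0
    shares: int = 0
    media_urls: list = field(default_factory=list)
    posted_at: str = ""

    @property
    def engagement(self) -> int:
        # Simple unweighted total; a real run might weight comments higher.
        return self.likes + self.comments + self.shares


def normalise_tweet(raw: dict, handle: str) -> Post:
    """Map a Twitter API v2 tweet dict (assumed shape) into the common schema."""
    metrics = raw.get("public_metrics", {})
    text = raw.get("text", "")
    return Post(
        platform="twitter",
        handle=handle,
        post_id=str(raw["id"]),
        text=text,
        hashtags=HASHTAG_RE.findall(text),
        likes=metrics.get("like_count", 0),
        comments=metrics.get("reply_count", 0),
        shares=metrics.get("retweet_count", 0),
        posted_at=raw.get("created_at", ""),
    )


def top_posts(posts: list, n: int = 3) -> list:
    """Return the n posts with the highest total engagement."""
    return sorted(posts, key=lambda p: p.engagement, reverse=True)[:n]


def posts_to_json(posts: list) -> str:
    """Serialise normalised posts as the JSON dump."""
    return json.dumps([asdict(p) for p in posts], ensure_ascii=False, indent=2)


def build_prompt(posts: list) -> str:
    """Build the text prompt that would be sent to the LLM endpoint."""
    return (
        "You are a brand analyst. Using the posts below, return a JSON report "
        "with three sections: Brand sentiment (positive / neutral / negative "
        "trends), Key thematic buckets, and Audience-engagement highlights.\n\n"
        "POSTS:\n" + posts_to_json(posts)
    )
```

A sample run would normalise each platform's raw records into `Post` objects, write `posts_to_json(posts)` to disk as the JSON dump, and send `build_prompt(posts)` to whichever LLM endpoint is configured in the .env file.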

Python
