{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "aa629e55-8f41-41ab-b319-b55dd1cfc76b",
   "metadata": {},
   "source": [
    "# Playwright Scraper Showcase (Async in Jupyter)\n",
    "\n",
    "This notebook demonstrates how to run async Playwright-based scraping code inside JupyterLab using `nest_asyncio`, which patches `asyncio` so that `asyncio.run()` can be called from a cell even though Jupyter's own event loop is already running.\n",
    "\n",
    "**Note:** `openai_scraper_playwright.py` must be in the same directory as this notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "97469777",
   "metadata": {},
   "outputs": [],
   "source": [
    "import nest_asyncio\n",
    "import asyncio\n",
    "nest_asyncio.apply()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "6254fa89",
   "metadata": {},
   "outputs": [],
   "source": [
    "from openai_scraper_playwright import EnhancedOpenAIScraper, analyze_content"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "33d2737b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "### 1. Overall Summary of the Website:\n",
      "The website appears to be a hub for various applications of AI technology, particularly focusing on the capabilities of ChatGPT and other AI models developed by OpenAI. It offers a range of services from answering queries, assisting in planning trips, explaining technical topics, helping with language translation, and providing educational content. The site also features updates on new AI models, research publications, and business solutions integrating AI.\n",
      "\n",
      "### 2. Key Individuals or Entities:\n",
      "- **OpenAI**: Mentioned as the organization behind the development of AI models and technologies such as ChatGPT, GPT-4.1, and image generation models. OpenAI seems to be focused on advancing and applying AI in various fields.\n",
      "- **Lyndon Barrois & Sora**: Featured in a story, possibly highlighting individual experiences or contributions within the OpenAI ecosystem.\n",
      "\n",
      "### 3. Recent Announcements or Updates:\n",
      "- **Introducing our latest image generation model in the API** (Product, Apr 23, 2025)\n",
      "- **Thinking with images** (Release, Apr 16, 2025)\n",
      "- **OpenAI announces nonprofit commission advisors** (Company, Apr 15, 2025)\n",
      "- **Our updated Preparedness Framework** (Publication, Apr 15, 2025)\n",
      "- **BrowseComp: a benchmark for browsing agents** (Publication, Apr 10, 2025)\n",
      "- **OpenAI Pioneers Program** (Company, Apr 9, 2025)\n",
      "\n",
      "### 4. Main Topics or Themes:\n",
      "- **AI Model Development and Application**: Discusses various AI models like ChatGPT, GPT-4.1, and image generation models.\n",
      "- **Educational and Practical AI Uses**: Offers help in educational topics, practical tasks, and creative endeavors using AI.\n",
      "- **Business Integration**: Focuses on integrating AI into business processes, automating tasks in finance, legal, and other sectors.\n",
      "- **Research and Publications**: Shares updates on the latest research and publications related to AI technology.\n",
      "\n",
      "### 5. Any Noteworthy Features or Projects:\n",
      "- **GPT-4.1 and Image Generation Models**: Introduction of new and advanced AI models for text and image processing.\n",
      "- **OpenAI Pioneers Program**: A significant initiative likely aimed at fostering innovation and practical applications of AI technology.\n",
      "- **BrowseComp and PaperBench**: Research projects or benchmarks designed to evaluate and improve AI capabilities in specific domains.\n"
     ]
    }
   ],
   "source": [
    "result = asyncio.run(analyze_content())\n",
    "print(result)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d7450ccf",
   "metadata": {},
   "source": [
    "✅ If you see structured analysis above, the async code ran successfully in Jupyter!"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9a46716c-6f77-4b2b-b423-cc9fe05014da",
   "metadata": {},
   "source": [
    "# 🧪 Playwright Scraper Output (Formatted)\n",
    "\n",
    "---\n",
    "\n",
    "## 🧭 1. **Overall Summary of the Website**\n",
    "\n",
    "*The website appears to be focused on showcasing various applications and updates related to OpenAI's technology, specifically ChatGPT and other AI tools. It provides information on product releases, company updates, and educational content on how to use AI technologies in different scenarios such as planning trips, learning games, coding, and more.*\n",
    "\n",
    "---\n",
    "\n",
    "## 🧑‍💼 2. **Key Individuals or Entities**\n",
    "\n",
    "- **OpenAI**: Company behind the technologies and updates discussed on the website  \n",
    "- **Lyndon Barrois & Sora**: Featured in a story, possibly highlighting user experiences or contributions\n",
    "\n",
    "---\n",
    "\n",
    "## 📰 3. **Recent Announcements or Updates**\n",
    "\n",
    "- 📢 **Introducing GPT-4.1 in the API**: *(no date provided)*\n",
    "- 🖼️ **Introducing 4o Image Generation**: *(no date provided)*\n",
    "- 🐟 **Catching halibut with ChatGPT**: *(no date provided)*\n",
    "- 🧠 **Thinking with images**: *Apr 16, 2025*\n",
    "- 🧑‍⚖️ **Nonprofit commission advisors announced**: *Apr 15, 2025*\n",
    "- ⚙️ **Updated Preparedness Framework**: *Apr 15, 2025*\n",
    "- 🌐 **BrowseComp benchmark for browsing agents**: *Apr 10, 2025*\n",
    "- 🚀 **OpenAI Pioneers Program launched**: *Apr 9, 2025*\n",
    "- 📊 **PaperBench research benchmark published**: *Apr 2, 2025*\n",
    "\n",
    "---\n",
    "\n",
    "## 📚 4. **Main Topics or Themes**\n",
    "\n",
    "- 🤖 **AI Technology Applications**: Using AI for tasks like planning, learning, and troubleshooting  \n",
    "- 🧩 **Product and Feature Releases**: Updates on new capabilities  \n",
    "- 📘 **Educational Content**: Guides for using AI effectively  \n",
    "- 🧪 **Research and Development**: Publications and technical benchmarks\n",
    "\n",
    "---\n",
    "\n",
    "## ⭐ 5. **Noteworthy Features or Projects**\n",
    "\n",
    "- ✅ **GPT-4.1**: A new API-accessible version of the language model  \n",
    "- 🖼️ **4o Image Generation**: Feature focused on AI-generated images  \n",
    "- 🚀 **OpenAI Pioneers Program**: Initiative likely fostering innovation in AI  \n",
    "- 📊 **BrowseComp & PaperBench**: Benchmarks for evaluating AI agents\n",
    "\n",
    "---\n",
    "\n",
    "✅ *If you're reading this inside Jupyter and seeing clean structure, your async notebook setup is working beautifully.*\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "95c38374-5daa-487c-8bd9-919bb4037ea3",
   "metadata": {},
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}